ETAPS Foreword


Welcome to the 23rd ETAPS! This is the first time that ETAPS took place in Ireland,
in its beautiful capital Dublin.

ETAPS 2020 was the 23rd instance of the European Joint Conferences on Theory 
and Practice of Software. ETAPS is an annual federated conference established in 
1998, and consists of four conferences: ESOP, FASE, FoSSaCS, and TACAS. Each 
conference has its own Program Committee (PC) and its own Steering Committee 
(SC). The conferences cover various aspects of software systems, ranging from 
theoretical computer science to foundations of programming language developments, 
analysis tools, and formal approaches to software engineering. Organizing these 
conferences in a coherent, highly synchronized conference program enables researchers 
to participate in an exciting event, having the possibility to meet many colleagues 
working in different directions in the field, and to easily attend talks of different 
conferences. On the weekend before the main conference, numerous satellite 
workshops took place that attracted many researchers from all over the globe. Also, for 
the second time, an ETAPS Mentoring Workshop was organized. This workshop is 
intended to help students early in the program with advice on research, career, and life 
in the fields of computing that are covered by the ETAPS conference. 

ETAPS 2020 received 424 submissions in total, 129 of which were accepted, 
yielding an overall acceptance rate of 30.4%. I thank all the authors for their interest in 
ETAPS, all the reviewers for their reviewing efforts, the PC members for their 
contributions, and in particular the PC (co-)chairs for their hard work in running this 
entire intensive process. Last but not least, my congratulations to all authors of the 
accepted papers! 

ETAPS 2020 featured the unifying invited speakers Scott Smolka (Stony Brook 
University) and Jane Hillston (University of Edinburgh) and the conference-specific 
invited speakers (ESOP) Isil Dillig (University of Texas at Austin) and (FASE) Willem 
Visser (Stellenbosch University). Invited tutorials were provided by Erika Abraham 
(RWTH Aachen University) on the analysis of hybrid systems and Madhusudan 
Parthasarathy (University of Illinois at Urbana-Champaign) on combining Machine 
Learning and Formal Methods. On behalf of the ETAPS 2020 attendants, I thank all the 
speakers for their inspiring and interesting talks! 

ETAPS 2020 took place in Dublin, Ireland, and was organized by the University of 
Limerick and Lero. ETAPS 2020 is further supported by the following associations and 
societies: ETAPS e.V., EATCS (European Association for Theoretical Computer 
Science), EAPLS (European Association for Programming Languages and Systems), 
and EASST (European Association of Software Science and Technology). The local 
organization team consisted of Tiziana Margaria (general chair, UL and Lero), 
Vasileios Koutavas (Lero@UCD), Anila Mjeda (Lero@UL), Anthony Ventresque 
(Lero@UCD), and Petros Stratis (Easy Conferences).

The ETAPS Steering Committee (SC) consists of an Executive Board, and
representatives of the individual ETAPS conferences, as well as representatives of
EATCS, EAPLS, and EASST. The Executive Board consists of Holger Hermanns
(Saarbrücken), Marieke Huisman (chair, Twente), Joost-Pieter Katoen (Aachen and
Twente), Jan Kofron (Prague), Gerald Lüttgen (Bamberg), Tarmo Uustalu (Reykjavik
and Tallinn), Caterina Urban (Inria, Paris), and Lenore Zuck (Chicago).

Other members of the SC are: Armin Biere (Linz), Jordi Cabot (Barcelona), Jean 
Goubault-Larrecq (Cachan), Jan-Friso Groote (Eindhoven), Esther Guerra (Madrid), 
Jurriaan Hage (Utrecht), Reiko Heckel (Leicester), Panagiotis Katsaros (Thessaloniki), 
Stefan Kiefer (Oxford), Barbara König (Duisburg), Fabrice Kordon (Paris), Jan 
Kretinsky (Munich), Kim G. Larsen (Aalborg), Tiziana Margaria (Limerick), Peter 
Müller (Zurich), Catuscia Palamidessi (Palaiseau), Dave Parker (Birmingham), 
Andrew M. Pitts (Cambridge), Peter Ryan (Luxembourg), Don Sannella (Edinburgh), 
Bernhard Steffen (Dortmund), Mariëlle Stoelinga (Twente), Gabriele Taentzer
(Marburg), Christine Tasson (Paris), Peter Thiemann (Freiburg), Jan Vitek (Prague), 
Heike Wehrheim (Paderborn), Anton Wijs (Eindhoven), and Nobuko Yoshida 
(London). 

I would like to take this opportunity to thank all speakers, attendants, organizers 
of the satellite workshops, and Springer for their support. I hope you all enjoyed 
ETAPS 2020. Finally, a big thanks to Tiziana and her local organization team for all 
their enormous efforts enabling a fantastic ETAPS in Dublin! 


February 2020 Marieke Huisman 
ETAPS SC Chair 
ETAPS e.V. President 


Preface 


This volume contains the papers presented at the 23rd International Conference on 
Foundations of Software Science and Computation Structures (FoSSaCS), which took 
place in Dublin, Ireland, during April 27–30, 2020. The conference series is dedicated
to foundational research with a clear significance for software science. It brings
together research on theories and methods to support the analysis, integration,
synthesis, transformation, and verification of programs and software systems.

This volume contains 31 contributed papers selected from 98 full paper submissions,
and also a paper accompanying an invited talk by Scott Smolka (Stony Brook
University, USA). Each submission was reviewed by at least three Program Committee 
members, with the help of external reviewers, and the final decisions took into account 
the feedback from a rebuttal phase. The conference submissions were managed using 
the EasyChair conference system, which was also used to assist with the compilation 
of these proceedings. 

We wish to thank all the authors who submitted papers to FoSSaCS 2020, the 
Program Committee members, the Steering Committee members and the external 
reviewers. In addition, we are grateful to the ETAPS 2020 Organization for providing 
an excellent environment for FoSSaCS 2020 alongside the other ETAPS conferences 
and workshops. 


Neural Flocking: MPC-based Supervised
Learning of Flocking Controllers


Usama Mehmood¹ (✉), Shouvik Roy¹, Radu Grosu², Scott A. Smolka¹,
Scott D. Stoller¹, and Ashish Tiwari³


¹ Stony Brook University, Stony Brook NY, USA
umehmood@cs.stonybrook.edu
² Technische Universität Wien, Wien, Austria
³ Microsoft Research, San Francisco CA, USA


Abstract. We show how a symmetric and fully distributed flocking
controller can be synthesized using Deep Learning from a centralized flocking
controller. Our approach is based on Supervised Learning, with the
centralized controller providing the training data, in the form of trajectories
of state-action pairs. We use Model Predictive Control (MPC) for the
centralized controller, an approach that we have successfully demonstrated
on flocking problems. MPC-based flocking controllers are high-performing
but also computationally expensive. By learning a symmetric and
distributed neural flocking controller from a centralized MPC-based one,
we achieve the best of both worlds: the neural controllers have high
performance (on par with the MPC controllers) and high efficiency. Our
experimental results demonstrate the sophisticated nature of the
distributed controllers we learn. In particular, the neural controllers are
capable of achieving myriad flocking-oriented control objectives,
including flocking formation, collision avoidance, obstacle avoidance, predator
avoidance, and target seeking. Moreover, they generalize the behavior
seen in the training data to achieve these objectives in a significantly
broader range of scenarios. In terms of verification of our neural
flocking controller, we use a form of statistical model checking to compute
confidence intervals for its convergence rate and time to convergence.


Keywords: Flocking · Model Predictive Control · Distributed Neural Controller
· Deep Neural Network · Supervised Learning


1 Introduction 


With the introduction of Reynolds' rule-based model [16,17], it is now possible
to understand the flocking problem as one of distributed control. Specifically, in
this model, at each time-step, each agent executes a control law given in terms
of the weighted sum of three competing forces to determine its next acceleration.
Each of these forces has its own rule: separation (keep a safe distance away
from your neighbors), cohesion (move towards the centroid of your neighbors),
and alignment (steer toward the average heading of your neighbors). Reynolds'
controller is distributed; i.e., it is executed separately by each agent, using
information about only itself and nearby agents, and without communication.
Furthermore, it is symmetric; i.e., every agent runs the same controller (same
code).

© The Author(s) 2020

Fig. 1: Neural Flocking Architecture

We subsequently showed that a simpler, more declarative approach to the
flocking problem is possible [11]. In this setting, flocking is achieved when the
agents combine to minimize a system-wide cost function. We presented centralized
and distributed solutions for achieving this form of “declarative flocking” (DF),
both of which were formulated in terms of Model-Predictive Control (MPC) [2].

Another advantage of DF over the rule-based approach exemplified by
Reynolds' model is that it allows one to consider additional control objectives
(e.g., obstacle and predator avoidance) simply by extending the cost function 
with additional terms for these objectives. Moreover, these additional terms are 
typically quite straightforward in nature. In contrast, deriving behavioral rules 
that achieve the new control objectives can be a much more challenging task. 

An issue with MPC is that computing the next control action can be
computationally expensive, as MPC searches for an action sequence that minimizes the
cost function over a given prediction horizon. This renders MPC unsuitable for
real-time applications with short control periods, for which flocking is a prime
example. Another potential problem with MPC-based approaches to flocking is
their performance (in terms of achieving the desired flight formation), which may
suffer in a fully distributed setting.

In this paper, we present Neural Flocking (NF), a new approach to the 
flocking problem that uses Supervised Learning to learn a symmetric and fully 
distributed flocking controller from a centralized MPC-based controller. By doing 
so, we achieve the best of both worlds: high performance (on par with the MPC 
controllers) in terms of meeting flocking flight-formation objectives, and high 
efficiency leading to real-time flight controllers. Moreover, our NF controllers can 
easily be parallelized on hardware accelerators such as GPUs and TPUs. 

Figure 1 gives an overview of the NF approach. A high-performing centralized
MPC controller provides the labeled training data to the learning agent: a 
symmetric and distributed neural controller in the form of a deep neural network 
(DNN). The training data consists of trajectories of state-action pairs, where a 
state contains the information known to an agent at a time step (e.g., its own 
position and velocity, and the position and velocity of its neighbors), and the 
action (the label) is the acceleration assigned to that agent at that time step by 
the centralized MPC controller. 

We formulate and evaluate NF in a number of essential flocking scenarios:
basic flocking with inter-agent collision avoidance, as in [11], and more advanced
scenarios with additional objectives, including obstacle avoidance, predator
avoidance, and target seeking by the flock. We conduct an extensive performance
evaluation of NF. Our experimental results demonstrate the sophisticated nature
of NF controllers. In particular, they are capable of achieving all of the stated
control objectives. Moreover, they generalize the behavior seen in the training
data in order to achieve these objectives in a significantly broader range of
scenarios. In terms of verification of our neural controller, we use a form of statistical
model checking to compute confidence intervals for its rate of convergence
to a flock and for its time to convergence.


2 Background 


We consider a set of $n$ dynamic agents $\mathcal{A} = \{1,\ldots,n\}$ that move according to
the following discrete-time equations of motion:

\[
\begin{aligned}
p_i(k+1) &= p_i(k) + dt \cdot v_i(k), & |v_i(k)| &< \bar{v} \\
v_i(k+1) &= v_i(k) + dt \cdot a_i(k), & |a_i(k)| &< \bar{a}
\end{aligned}
\tag{1}
\]

where $p_i(k) \in \mathbb{R}^2$, $v_i(k) \in \mathbb{R}^2$, and $a_i(k) \in \mathbb{R}^2$ are the position, velocity, and
acceleration of agent $i \in \mathcal{A}$, respectively, at time step $k$, and $dt \in \mathbb{R}^+$ is the time step. The
magnitudes of velocities and accelerations are bounded by $\bar{v}$ and $\bar{a}$, respectively.
Acceleration $a_i(k)$ is the control input for agent $i$ at time step $k$. The acceleration
is updated after every $\eta$ time steps; i.e., $\eta \cdot dt$ is the control period. The flock
configuration at time step $k$ is thus given by the following vectors (in boldface):

\begin{align}
\mathbf{p}(k) &= [p_1^T(k) \cdots p_n^T(k)]^T \tag{2} \\
\mathbf{v}(k) &= [v_1^T(k) \cdots v_n^T(k)]^T \tag{3} \\
\mathbf{a}(k) &= [a_1^T(k) \cdots a_n^T(k)]^T \tag{4}
\end{align}

The configuration vectors are referred to without the time indexing as $\mathbf{p}$,
$\mathbf{v}$, and $\mathbf{a}$. The neighborhood of agent $i$ at time step $k$, denoted by $\mathcal{N}_i(k) \subseteq \mathcal{A}$,
contains its $M$-nearest neighbors, i.e., the $M$ other agents closest to it. We use
this definition (in Section 2.2, to define a distributed-flocking cost function) for
simplicity, and expect that a radius-based definition of neighborhood would lead
to similar results for our distributed flocking controllers.
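As a minimal sketch of the dynamics in Eq. (1) and of the nearest-neighbor definition above, the following pure-Python code advances a flock by one time step. The function names (`clip_norm`, `step`, `nearest_neighbors`) and the choice to enforce the velocity and acceleration bounds by rescaling vectors are illustrative assumptions; the text only states that the magnitudes are bounded.

```python
import math

def clip_norm(vec, bound):
    """Rescale a 2-D vector so that its Euclidean norm does not exceed `bound`."""
    norm = math.hypot(vec[0], vec[1])
    if norm <= bound or norm == 0.0:
        return vec
    scale = bound / norm
    return (vec[0] * scale, vec[1] * scale)

def step(pos, vel, acc, dt=0.1, v_max=2.0, a_max=1.5):
    """One time step of Eq. (1) for all agents.

    Bounds are enforced by clipping, which is an assumption of this sketch.
    """
    new_pos, new_vel = [], []
    for p, v, a in zip(pos, vel, acc):
        a = clip_norm(a, a_max)
        # Position uses the current velocity; velocity uses the current acceleration.
        new_pos.append((p[0] + dt * v[0], p[1] + dt * v[1]))
        new_vel.append(clip_norm((v[0] + dt * a[0], v[1] + dt * a[1]), v_max))
    return new_pos, new_vel

def nearest_neighbors(pos, i, m):
    """Indices of the m agents closest to agent i, excluding i itself."""
    others = [j for j in range(len(pos)) if j != i]
    others.sort(key=lambda j: math.dist(pos[i], pos[j]))
    return others[:m]
```

The default values for `dt`, `v_max`, and `a_max` are the ones reported in the experimental setup of this paper.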


2.1 Model-Predictive Control 


Model-Predictive Control (MPC) [2] is a well-known control technique that has
recently been applied to the flocking problem [19,20]. At each control step,
an optimization problem is solved to find the optimal sequence of control actions
(agent accelerations in our case) that minimizes a given cost function with respect
to a predictive model of the system. The first control action of the optimal control
sequence is then applied to the system; the rest is discarded. In the computation
of the cost function, the predictive model is evaluated for a finite prediction
horizon of T control steps.

MPC-based flocking models can be categorized as centralized or distributed. A 
centralized model assumes that complete information about the flock is available 
to a single “global” controller, which uses the states of all agents to compute 
their next optimal accelerations. The following optimization problem is solved by 
a centralized MPC controller at each control step k: 


\[
\min_{\mathbf{a}(k \mid k),\ldots,\mathbf{a}(k+T-1 \mid k) < \bar{a}} \; J(k) + \lambda \cdot \sum_{t=0}^{T-1} \|\mathbf{a}(k+t \mid k)\|^2 \tag{5}
\]

The first term $J(k)$ is the centralized model-specific cost, evaluated for $T$ control
steps (this embodies the predictive aspect of MPC), starting at time step $k$. It
encodes the control objective. The second
term, scaled by a weight $\lambda > 0$, penalizes large control inputs: $\mathbf{a}(k+t \mid k)$ are
the predictions made at time step $k$ for the accelerations at time step $k+t$.
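The receding-horizon idea behind Eq. (5) can be illustrated with a deliberately small example. The sketch below is not the optimizer used in this work: it replaces numerical optimization with exhaustive enumeration over a small set of candidate accelerations, and it uses a hypothetical 1-D point mass whose cost is its squared distance to the origin. All names (`mpc_first_action`, `double_integrator`, `squared_distance_to_origin`) are illustrative.

```python
import itertools

def mpc_first_action(state, candidates, horizon, model, state_cost, lam):
    """Return the first action of the cheapest action sequence over the horizon.

    The objective mirrors Eq. (5): predicted state cost plus lam times the
    squared control input. Exhaustive search stands in for a real optimizer.
    """
    best_cost, best_seq = float("inf"), None
    for seq in itertools.product(candidates, repeat=horizon):
        s, cost = state, 0.0
        for a in seq:
            s = model(s, a)                      # predict the next state
            cost += state_cost(s) + lam * a * a  # accumulate cost and penalty
        if cost < best_cost:
            best_cost, best_seq = cost, seq
    # Only the first action is applied; the rest of the sequence is discarded.
    return best_seq[0]

def double_integrator(state, a, dt=1.0):
    """Hypothetical 1-D model: state is (position, velocity)."""
    p, v = state
    return (p + dt * v, v + dt * a)

def squared_distance_to_origin(state):
    return state[0] ** 2
```

With a small penalty weight the controller accelerates toward the origin, whereas a large weight makes doing nothing the cheapest choice, which shows the role of $\lambda$ in trading off the control objective against control effort.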

In distributed MPC, each agent computes its acceleration based only on its 
own state and its local knowledge, e.g., information about its neighbors: 


\[
\min_{a_i(k \mid k),\ldots,a_i(k+T-1 \mid k) < \bar{a}} \; J_i(k) + \lambda \cdot \sum_{t=0}^{T-1} \|a_i(k+t \mid k)\|^2 \tag{6}
\]

$J_i(k)$ is the distributed, model-specific cost function for agent $i$, analogous to $J(k)$.
In a distributed setting where an agent’s knowledge of its neighbors’ behavior
is limited, an agent cannot calculate the exact future behavior of its neighbors.
Hence, the predictive aspect of $J_i(k)$ must rely on some assumption about
that behavior during the prediction horizon. Our distributed cost functions are 
based on the assumption that the neighbors have zero accelerations during the 
prediction horizon. While this simple design is clearly not completely accurate, 
our experiments show that it still achieves good results. 
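To make the zero-acceleration assumption concrete, substituting $a_j = 0$ into Eq. (1) yields a constant-velocity prediction for each neighbor $j$. The short derivation below is a sketch under that assumption; the notation $p_j(k+t \mid k)$ for agent $i$'s prediction of neighbor $j$ is introduced here for illustration, and $t$ counts time steps.

```latex
\begin{aligned}
v_j(k+t \mid k) &= v_j(k), \\
p_j(k+t \mid k) &= p_j(k) + t \cdot dt \cdot v_j(k), \qquad t = 0, 1, 2, \ldots
\end{aligned}
```

Each agent therefore needs only the current positions and velocities of its neighbors to evaluate its cost over the prediction horizon.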


2.2 Declarative Flocking 


Declarative flocking (DF) is a high-level approach to designing flocking algorithms
based on defining a suitable cost function for MPC [11]. This is in contrast to the
operational approach, where a set of rules is used to capture flocking behavior,
as in Reynolds' model. For basic flocking, the DF cost function contains two terms:
(1) a cohesion term based on the squared distance between each pair of agents in
the flock; and (2) a separation term based on the inverse of the squared distance
between each pair of agents. The flock evolves toward a configuration in which
these two opposing forces are balanced. The cost function $J^C$ for centralized DF,
i.e., centralized MPC (CMPC), is as follows:


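\[
J^C(\mathbf{p}) = \frac{2}{|\mathcal{A}| \cdot (|\mathcal{A}|-1)} \cdot \sum_{i \in \mathcal{A}} \sum_{j \in \mathcal{A},\, i<j} \|p_{ij}\|^2 \;+\; \omega_s \cdot \frac{2}{|\mathcal{A}| \cdot (|\mathcal{A}|-1)} \cdot \sum_{i \in \mathcal{A}} \sum_{j \in \mathcal{A},\, i<j} \frac{1}{\|p_{ij}\|^2} \tag{7}
\]

where $\|p_{ij}\| = \|p_i - p_j\|$ is the distance between agents $i$ and $j$, and $\omega_s$ is the weight of
the separation term and controls the density of the flock.
The cost function is normalized by the number of pairs of agents, $|\mathcal{A}| \cdot (|\mathcal{A}|-1)/2$;
as such, the cost does not depend on the size of the flock. The control law for
CMPC is given by Eq. (5), with $J(k) = \sum_{t=1}^{T} J^C(\mathbf{p}(k+t \mid k))$.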
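As a minimal sketch of the centralized cost in Eq. (7), the following pure-Python function evaluates the cohesion and separation terms for 2-D positions. It assumes that both terms are averaged over the agent pairs, in line with the normalization described above; the function and argument names are illustrative.

```python
import itertools
import math

def centralized_cost(positions, w_s):
    """Cohesion plus weighted separation, each averaged over all agent pairs.

    Averaging the separation term over pairs is an assumption of this sketch.
    """
    pairs = list(itertools.combinations(positions, 2))
    # Cohesion: mean squared pairwise distance.
    cohesion = sum(math.dist(p, q) ** 2 for p, q in pairs) / len(pairs)
    # Separation: mean inverse squared pairwise distance.
    separation = sum(1.0 / math.dist(p, q) ** 2 for p, q in pairs) / len(pairs)
    return cohesion + w_s * separation
```

Under this assumed form, two agents at distance $d$ incur the cost $d^2 + \omega_s / d^2$, which is minimized at $d = \omega_s^{1/4}$. This illustrates how the separation weight sets the spacing at which the two opposing forces balance.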

The basic flocking cost function for distributed DF is similar to that for 
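CMPC, except that the cost function $J_i^D$ for agent $i$ is computed over its set of
neighbors $\mathcal{N}_i(k)$ at time $k$:

\[
J_i^D(\mathbf{p}) = \frac{1}{|\mathcal{N}_i(k)|} \cdot \sum_{j \in \mathcal{N}_i(k)} \|p_{ij}\|^2 \;+\; \omega_s \cdot \frac{1}{|\mathcal{N}_i(k)|} \cdot \sum_{j \in \mathcal{N}_i(k)} \frac{1}{\|p_{ij}\|^2} \tag{8}
\]

The control law for agent $i$ is given by Eq. (6), with $J_i(k) = \sum_{t=1}^{T} J_i^D(\mathbf{p}(k+t \mid k))$.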


3 Additional Control Objectives 


The cost functions for basic flocking given in Eqs. (7) and (8) are designed to
ensure that in the steady state, the agents are well-separated. Additional goals 
such as obstacle avoidance, predator avoidance, and target seeking are added 
to the MPC formulation as weighted cost-function terms. Different objectives 
can be combined by including the corresponding terms in the cost function as a 
weighted sum. 


Cost-Function Term for Obstacle Avoidance. We consider multiple rectangular
obstacles which are distributed randomly in the field. For a set of $m$ rectangular
obstacles $\mathcal{O} = \{O_1, O_2, \ldots, O_m\}$, we define the cost function term for obstacle
avoidance as:

\[
J_{OA}(\mathbf{p}, \mathbf{o}) = \frac{1}{|\mathcal{A}| \cdot |\mathcal{O}|} \sum_{i \in \mathcal{A}} \sum_{j \in \mathcal{O}} \frac{1}{\|p_i - o_j^{(i)}\|^2} \tag{9}
\]

where $\mathbf{o}$ is the set of points on the obstacle boundaries and $o_j^{(i)}$ is the point on
the obstacle boundary of the $j$-th obstacle $O_j$ that is closest to the $i$-th agent.


Cost-Function Term for Target Seeking. This term is the average of the squared
distance between the agents and the target. Let $g$ denote the position of the fixed
target. Then the target-seeking term is defined as

\[
J_{TS}(\mathbf{p}) = \frac{1}{|\mathcal{A}|} \sum_{i \in \mathcal{A}} \|p_i - g\|^2 \tag{10}
\]


Cost-Function Term for Predator Avoidance. We introduce a single predator,
which is more agile than the flocking agents: its maximum speed and acceleration
are a factor of $f_p$ greater than $\bar{v}$ and $\bar{a}$, respectively, with $f_p > 1$. Apart from
being more agile, the predator has the same dynamics as the agents, given by
Eq. (1). The control law for the predator consists of a single term that causes it
to move toward the centroid of the flock with maximum acceleration.

For a flock of $n$ agents and one predator, the cost-function term for predator
avoidance is the average of the inverse of the cube of the distances between the
predator and the agents. It is given by:

\[
J_{PA}(\mathbf{p}, p_{pred}) = \frac{1}{|\mathcal{A}|} \sum_{i \in \mathcal{A}} \frac{1}{\|p_i - p_{pred}\|^3} \tag{11}
\]

where $p_{pred}$ is the position of the predator. In contrast to the separation term
in Eqs. (7)–(8), which we designed to ensure inter-agent collision avoidance, the
predator-avoidance term has a cube instead of a square in the denominator. This
is to reduce the influence of the predator on the flock when the predator is far
away from the flock.


NF Cost-Function Terms. The MPC cost functions used in our examination of
Neural Flocking are weighted sums of the cost function terms introduced above.
We refer to the first term of our centralized DF cost function $J^C(\mathbf{p})$ (see Eq. (7))
as $J_{cohes}(\mathbf{p})$ and the second, without its weight, as $J_{sep}(\mathbf{p})$. We use the following cost functions $J_1$,
$J_2$, and $J_3$ for basic flocking with collision avoidance, obstacle avoidance with
target seeking, and predator avoidance, respectively.

\begin{align}
J_1(\mathbf{p}) &= J_{cohes}(\mathbf{p}) + \omega_s \cdot J_{sep}(\mathbf{p}) \tag{12a} \\
J_2(\mathbf{p}, \mathbf{o}) &= J_{cohes}(\mathbf{p}) + \omega_s \cdot J_{sep}(\mathbf{p}) + \omega_o \cdot J_{OA}(\mathbf{p}, \mathbf{o}) + \omega_t \cdot J_{TS}(\mathbf{p}) \tag{12b} \\
J_3(\mathbf{p}, p_{pred}) &= J_{cohes}(\mathbf{p}) + \omega_s \cdot J_{sep}(\mathbf{p}) + \omega_p \cdot J_{PA}(\mathbf{p}, p_{pred}) \tag{12c}
\end{align}

where $\omega_s$ is the weight of the separation term, $\omega_o$ is the weight of the obstacle
avoidance term, $\omega_t$ is the weight of the target-seeking term, and $\omega_p$ is the weight
of the predator-avoidance term. Note that $J_1$ is equivalent to $J^C$ (Eq. (7)). The
weight $\omega_s$ of the separation term is experimentally chosen to ensure that the
distance between agents, throughout the simulation, is at least $d_{min}$, the minimum
inter-agent distance representing collision avoidance. Similar considerations were
given to the choice of values for $\omega_o$ and $\omega_p$. The specific values we used for the
weights are: $\omega_s = 2000$, $\omega_o = 1500$, $\omega_t = 10$, and $\omega_p = 500$.
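The target-seeking and predator-avoidance terms (Eqs. (10) and (11)) and the weighted sums of Eq. (12) translate directly into code. The following pure-Python sketch uses illustrative function names and a hypothetical `WEIGHTS` dictionary that holds the weight values quoted above; the cohesion term carries an implicit weight of 1.

```python
import math

def target_seeking_cost(positions, target):
    """Eq. (10): mean squared distance from the agents to the target."""
    return sum(math.dist(p, target) ** 2 for p in positions) / len(positions)

def predator_avoidance_cost(positions, predator):
    """Eq. (11): mean inverse cubed distance from the agents to the predator."""
    return sum(1.0 / math.dist(p, predator) ** 3 for p in positions) / len(positions)

def weighted_cost(terms, weights):
    """Weighted sum of already-evaluated cost terms, as in Eqs. (12a)-(12c)."""
    return sum(weights[name] * value for name, value in terms.items())

# Weight values reported in the text; the dictionary keys are illustrative.
WEIGHTS = {"cohes": 1.0, "sep": 2000.0, "oa": 1500.0, "ts": 10.0, "pa": 500.0}
```

For example, `weighted_cost({"cohes": c, "sep": s, "pa": q}, WEIGHTS)` evaluates $J_3$ once the three terms `c`, `s`, and `q` have been computed. Because the predator term decays with the cube of the distance, an agent ten units away contributes 0.001, compared with 0.01 for an inverse-square term.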

We experimented with an alternative strategy for introducing inter-agent
collision avoidance, obstacle avoidance, and predator avoidance into the MPC
problem, namely, as constraints of the form $d_{min} - \|p_{ij}\| < 0$, $d_{min} - \|p_i - o_j^{(i)}\| < 0$,
and $d_{min} - \|p_i - p_{pred}\| < 0$, respectively. Using the theory of exact
penalty functions [12], we recast the constrained MPC problem as an equivalent
unconstrained MPC problem by converting the constraints into a weighted
penalty term, which is then added to the MPC cost function. This approach
rendered the optimization problem difficult to solve due to the non-smoothness
of the penalty term. As a result, constraint violations in the form of collisions
were observed during simulation.
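For intuition about the non-smoothness mentioned above, a common exact penalty is the $\ell_1$ form, which replaces each constraint $g_c \le 0$ by a term $\max(0, g_c)$. The following is a generic sketch: the penalty weight $\sigma$ and the constraint index set $\mathcal{C}$ are notation introduced here, and the text does not state which exact penalty function was actually used.

```latex
\min_{\mathbf{a}} \; J(\mathbf{a}) + \sigma \cdot \sum_{c \in \mathcal{C}} \max\bigl(0,\; g_c(\mathbf{a})\bigr),
\qquad \text{e.g. } g_c = d_{min} - \|p_{ij}\|
```

The $\max(0, \cdot)$ term has a kink wherever a constraint becomes active, so the objective is not differentiable at exactly the configurations where a collision is imminent.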


4 Neural Flocking


We learn a distributed neural controller (DNC) for the flocking problem using 
training data in the form of trajectories of state-action pairs produced by a CMPC 
controller. In addition to basic flocking with inter-agent collision avoidance, the 
DNC exhibits a number of other flocking-related behaviors, including obstacle 
avoidance, target seeking, and predator avoidance. We also show how the learned 
behavior exhibited by the DNC generalizes over a larger number of agents than 
what was used during training to achieve successful collision-free flocking in 
significantly larger flocks. 

We use Supervised Learning to train the DNC. Supervised Learning learns a 
function that maps an input to an output based on example sequences of input- 
output pairs. In our case, the trajectory data obtained from CMPC contains both 
the training inputs and corresponding labels (outputs): the state of an agent in 
the flock (and that of its nearest neighbors) at a particular time step is the input, 
and that agent’s acceleration at the same time step is the label. 


4.1 Training Distributed Flocking Controllers 


We use Deep Learning to synthesize a distributed and symmetric neural controller 
from the training data provided by the CMPC controller. Our objective is to learn 
basic flocking, obstacle avoidance with target seeking, and predator avoidance. 
Their respective CMPC-based cost functions are given in Sections 2.2 and 3. All
of these control objectives implicitly also include inter-agent collision avoidance
by virtue of the separation term in Eq. (7).

For each of these control objectives, DNC training data is obtained from 
CMPC trajectory data generated for n = 15 agents, starting from initial
configurations in which agent positions and velocities are uniformly sampled from
$[-15, 15]^2$ and $[0, 1]^2$, respectively. All training trajectories are 1,000 time steps
in duration. 

We further ensure that the initial configurations are recoverable; i.e., no two 
agents are so close to each other that they cannot avoid a collision by resorting 
to maximal accelerations. We learn a single DNC from the state-action pairs of 
all n agents. This yields a symmetric distributed controller, which we use for 
each agent in the flock during evaluation. 


Basic Flocking. Trajectory data for basic flocking is generated using the cost
function given in Eq. (7). We generate 200 trajectories, each of which (as noted
above) is 1,000 time steps long. The input to the NN is the position and velocity
of each agent along with the positions and velocities of its $M$-nearest neighbors.
This yields 200 · 1,000 · 15 = 3,000,000 total training samples.

Let us refer to the agent (the DNC) being learned as $A_0$. Since we use
neighborhood size $M = 14$, the input to the NN is of the form
$[p_0^x\ p_0^y\ v_0^x\ v_0^y\ p_1^x\ p_1^y\ v_1^x\ v_1^y\ \ldots\ p_{14}^x\ p_{14}^y\ v_{14}^x\ v_{14}^y]$,
where $p_0^x, p_0^y$ are the position coordinates and $v_0^x, v_0^y$ the
velocity coordinates of agent $A_0$, and $p_{1 \ldots 14}^x, p_{1 \ldots 14}^y$ and $v_{1 \ldots 14}^x, v_{1 \ldots 14}^y$ are the
position and velocity coordinates of its neighbors. Since this input vector has 60
components, the input to the NN consists of 60 features.
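The construction of the 60-feature input can be sketched as follows. The text does not specify the order in which neighbors appear in the vector or whether coordinates are expressed relative to the agent; this sketch assumes neighbors sorted by increasing distance and absolute coordinates, and the function name is illustrative.

```python
import math

def basic_flocking_features(pos, vel, i, m=14):
    """Feature vector for agent i: its own state, then its m nearest neighbors.

    Neighbor ordering (by distance) and absolute coordinates are assumptions.
    """
    others = sorted(
        (j for j in range(len(pos)) if j != i),
        key=lambda j: math.dist(pos[i], pos[j]),
    )
    features = []
    for j in [i] + others[:m]:
        # Four values per agent: x and y position, x and y velocity.
        features.extend([pos[j][0], pos[j][1], vel[j][0], vel[j][1]])
    return features
```

With 15 agents and `m = 14`, each call returns 4 · 15 = 60 values. One training sample pairs this vector with the acceleration that the centralized controller assigned to agent `i` at the same time step.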

(a) Basic flocking (b) Obstacle avoid. (c) Predator avoid. (d) Target seeking


Fig. 2: Snapshots of DNC flocking behaviors for 30 agents 


Obstacle Avoidance with Target Seeking. For obstacle avoidance with target
seeking, we use CMPC with the cost function given in Eq. (12b). The target is
located beyond the obstacles, forcing the agents to move through the obstacle
field. For the training data, we generate 100 trajectories over 4 different obstacle
fields (25 trajectories per obstacle field). The input to the NN consists of the 92
features $[p_0^x\ p_0^y\ v_0^x\ v_0^y\ o_0^x\ o_0^y\ \ldots\ p_{14}^x\ p_{14}^y\ v_{14}^x\ v_{14}^y\ o_{14}^x\ o_{14}^y\ g^x\ g^y]$,
where $o_0^x, o_0^y$ is the closest point on any obstacle to agent $A_0$;
$o_{1 \ldots 14}^x, o_{1 \ldots 14}^y$ give the closest point on
any obstacle for the 14 neighboring agents, and $g^x, g^y$ is the target location.


Predator Avoidance. The CMPC cost function for predator avoidance is given in
Eq. (12c). The position, velocity, and acceleration of the predator are denoted
by $p_{pred}$, $v_{pred}$, and $a_{pred}$, respectively. We take $f_p = 1.40$; hence $\bar{v}_{pred} = 1.40\,\bar{v}$ and
$\bar{a}_{pred} = 1.40\,\bar{a}$. The input features to the NN are the positions and velocities
of agent $A_0$ and its $M$-nearest neighbors, and the position and velocity of the
predator. The input with 64 features thus has the form
$[p_0^x\ p_0^y\ v_0^x\ v_0^y\ \ldots\ p_{14}^x\ p_{14}^y\ v_{14}^x\ v_{14}^y\ p_{pred}^x\ p_{pred}^y\ v_{pred}^x\ v_{pred}^y]$.


5 Experimental Evaluation


This section contains the results of our extensive performance analysis of the 
distributed neural flocking controller (DNC), taking into account various control 
objectives: basic flocking with collision avoidance, obstacle avoidance with target 
seeking, and predator avoidance. As illustrated in Fig. 1, this involves running
CMPC to generate the training data for the DNCs, whose performance we then 
compare to that of the DMPC and CMPC controllers. We also show that the 
DNC flocking controllers generalize the behavior seen in the training data to 
achieve successful collision-free flocking in flocks significantly larger in size than 
those used during training. Finally, we use Statistical Model Checking to obtain 
confidence intervals for DNC’s correctness/performance. 


5.1 Preliminaries 


The CMPC and DMPC control problems defined in Section 2.1 are solved using
MATLAB's fmincon optimizer. In the training phase, the size of the flock is
n = 15. For obstacle avoidance with target seeking, we use 5 obstacles with the
target located at [60, 50]. The simulation time is 100, $dt = 0.1$ time units, and
$\eta = 3$, where (recall) $\eta \cdot dt$ is the control period. Further, the agent velocity and
acceleration bounds are $\bar{v} = 2.0$ and $\bar{a} = 1.5$.

We use $d_{min} = 1.5$ as the minimum inter-agent distance for collision avoidance,
$d_{min}^{obs} = 1$ as the minimum agent-obstacle distance for obstacle avoidance, and
$d_{min}^{pred} = 1.5$ as the minimum agent-predator distance for predator avoidance. For
initial configurations, recall that agent positions and velocities are uniformly
sampled from $[-15, 15]^2$ and $[0, 1]^2$, respectively, and we ensure that they are
recoverable; i.e., no two agents are so close to each other that they cannot avoid
a collision when resorting to maximal accelerations. The predator starts at rest
from a fixed location at a distance of 40 from the flock center.

For training, we considered 15 agents and 200 trajectories per agent, each 
trajectory 1,000 time steps in length. This yielded a total of 3,000,000 training 
samples. Our neural controller is a fully connected feed-forward Deep Neural 
Network (DNN), with 5 hidden layers, 84 neurons per hidden layer, and with a 
ReLU activation function. We use an iterative approach for choosing the DNN 
hyperparameters and architecture where we continuously improve our NN, until 
we observe satisfactory performance by the DNC. 

For training the DNNs, we use Keras [3], which is a high-level neural network
API written in Python and capable of running on top of TensorFlow. To generate
the NN model, Keras uses the Adam optimizer [8] with the following settings:
$lr = 10^{-3}$, $\beta_1 = 0.9$, $\beta_2 = 0.999$, $\epsilon = 10^{-8}$. The batch size (number of samples
processed before the model is updated) is 2,000, and the number of epochs 
(number of complete passes through the training dataset) used for training is 
1,000. For measuring training loss, we use the mean-squared error metric. 

For basic flocking, DNN input vectors have 60 features and the number 
of trainable DNN parameters is 33,854. For flocking with obstacle-avoidance 
and target-seeking, input vectors have 92 features and the number of trainable 
parameters is 36,542. Finally, for flocking with predator-avoidance, input vectors 
have 64 features and the resulting number of trainable DNN parameters is 34,190. 
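The parameter counts quoted above can be checked with a few lines of arithmetic. The sketch below assumes a fully connected network with one bias per neuron and a 2-dimensional output (the agent's acceleration); the output size is an assumption, chosen because it reproduces the reported totals, and the function name is illustrative.

```python
def dense_param_count(n_inputs, hidden_layers=5, width=84, n_outputs=2):
    """Trainable parameters of a fully connected network with biases."""
    sizes = [n_inputs] + [width] * hidden_layers + [n_outputs]
    # Each layer contributes fan_in * fan_out weights plus fan_out biases.
    return sum(fan_in * fan_out + fan_out
               for fan_in, fan_out in zip(sizes, sizes[1:]))

counts = {n: dense_param_count(n) for n in (60, 92, 64)}
```

Only the first layer depends on the number of input features, which is why the three totals differ by multiples of 84.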

To test the trained DNC, we generated 100 simulations (runs) for each of the 
desired control objectives: basic flocking with collision avoidance, flocking with 
obstacle avoidance and target seeking, and flocking with predator avoidance. The 
results presented in Table 1 were obtained using the same number of agents and
obstacles and the same predator as in the training phase. We also ran tests that 
show DNC controllers can achieve collision-free flocking with obstacle avoidance 
where the numbers of agents and obstacles are greater than those used during 
training. 


5.2 Results for Basic Flocking 


We use flock diameter, inter-agent collision count, and velocity convergence as
performance metrics for flocking behavior. At any time step, the flock diameter
$D(\mathbf{p}) = \max_{(i,j) \in \mathcal{A}} \|p_{ij}\|$ is the largest distance between any two agents in the
flock. We calculate the average converged diameter by averaging the flock diameter
in the final time step of the simulation over the 100 runs. An inter-agent collision
(IC) occurs when the distance between two agents at any point in time is less than
$d_{min}$. The IC rate (ICR) is the average number of ICs per test-trajectory
time-step. The velocity convergence $VC(\mathbf{v}) = (1/n) \left( \sum_{i \in \mathcal{A}} \left\| v_i - \left( \sum_{j=1}^{n} v_j \right) / n \right\|^2 \right)$ is
the average of the squared magnitude of the discrepancy between the velocities of
agents and the flock’s average velocity. For all the metrics, lower values are better,
indicating a denser and more coherent flock with fewer collisions. A successful
flocking controller should also ensure that values of $D(\mathbf{p})$ and $VC(\mathbf{v})$ eventually
stabilize.

Fig. 3: Performance comparison for basic flocking with collision avoidance,
averaged over 100 test runs.
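The three metrics can be computed per time step as in the following pure-Python sketch. The function names are illustrative; averaging the collision count over the time steps of the test trajectories then gives the ICR as defined above.

```python
import itertools
import math

def flock_diameter(pos):
    """Largest distance between any two agents."""
    return max(math.dist(p, q) for p, q in itertools.combinations(pos, 2))

def velocity_convergence(vel):
    """Mean squared deviation of agent velocities from the flock's mean velocity."""
    n = len(vel)
    mean = (sum(v[0] for v in vel) / n, sum(v[1] for v in vel) / n)
    return sum((v[0] - mean[0]) ** 2 + (v[1] - mean[1]) ** 2 for v in vel) / n

def collision_count(pos, d_min=1.5):
    """Number of agent pairs closer than d_min at a single time step."""
    return sum(1 for p, q in itertools.combinations(pos, 2)
               if math.dist(p, q) < d_min)
```

A flock in which every agent has the same velocity has a velocity convergence of exactly zero, regardless of how fast it moves.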


Fig. 3 and Table 1 compare the performance of the DNC on the basic-flocking
problem for 15 agents to that of the MPC controllers. Although the DMPC and
CMPC outperform the DNC, the difference is marginal. An important advantage
of the DNC over DMPC is that it is much faster. Executing a DNC
requires a modest number of arithmetic operations, whereas executing an MPC
controller requires simulation of a model and controller over the prediction horizon.
In our experiments, on average, the CMPC takes 1209 msec of CPU time for the
entire flock and DMPC takes 58 msec of CPU time per agent, whereas the DNC
takes only 1.6 msec.


Table 1: Performance comparison for BF with 15 agents on 100 test runs 

Controller   Avg. Conv. Diameter   ICR   Velocity Convergence 
DNC          14.13                 0     0.15 
DMPC         13.67                 0     0.11 
CMPC         13.84                 0     0.10 

Table 2: DNC Performance Generalization for BF 

Agents   Avg. Conv. Diameter   Conv. Rate (%)   Avg. Conv. Time   ICR 
15       14.13                 100              52.15             0 
20       16.45                 97               58.76             0 
25       19.81                 94               64.11             0 
30       23.24                 92               72.08             0 
35       30.57                 86               83.84             0.008 
40       38.66                 81               95.32             0.019 


5.3 Results for Obstacle and Predator Avoidance 


For obstacle and predator avoidance, collision rates are used as a performance 
metric. An obstacle-agent collision (OC) occurs when the distance between an 
agent and the closest point on any obstacle is less than $d^{obs}_{min}$. A predator-agent 
collision (PC) occurs when the distance between an agent and the predator is less 
than $d^{pred}_{min}$. The OC rate (OCR) is the average number of OCs per test-trajectory 
time step, and the PC rate (PCR) is defined similarly. Our test results show 
that the DNC, along with the DMPC and CMPC, is collision-free (i.e., each 
of ICR, OCR, and PCR is zero) for 15 agents, with the exception of DMPC 
for predator avoidance where PCR = 0.013. We also observed that the flock 
successfully reaches the target location in all 100 test runs. 


5.4 DNC Generalization Results 


Tables 2 and 3 present DNC generalization results for basic flocking (BF), obstacle 
avoidance (OA), and predator avoidance (PA), with the number of agents ranging 
from 15 (the flock size during training) to 40. In all of these experiments, we use 
a neighborhood size of M = 14, the same as during training. Each controller was 
evaluated with 100 test runs. The performance metrics in Table 2 are the average 
converged diameter, convergence rate, average convergence time, and ICR. 

The convergence rate is the fraction of successful flocks over 100 runs. The 
collection of agents is said to have converged to a flock (with collision avoidance) 
if the value of the global cost function is less than the convergence threshold. 
We use a convergence threshold of $J_1(p) < 150$, which was chosen based on its 
proximity to the value achieved by CMPC. We use the basic-flocking cost function 
$J_1$ to calculate the success rate because we report the convergence rate for basic 
flocking. The average convergence time is the time when the global cost function 
first drops below the success threshold and remains below it for the rest of the 
run, averaged over all 100 runs. Even with a local neighborhood of size 14, the 
results demonstrate that the DNC can successfully generalize to a large number 
of agents for all of our control objectives. 
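The convergence rate and convergence time described above can be computed from per-run cost trajectories roughly as follows. This is a minimal sketch: the names `convergence_time` and `convergence_stats` are hypothetical, time is measured in step indices, and the average is taken over converged runs only, which is an assumption about how non-converged runs are handled.

```python
def convergence_time(costs, threshold):
    """First step index from which the cost stays below threshold until the end.

    Returns None if the run never settles below the threshold.
    """
    t = None
    for k, c in enumerate(costs):
        if c < threshold:
            if t is None:
                t = k          # candidate start of the final below-threshold stretch
        else:
            t = None           # cost rose again, discard the candidate
    return t


def convergence_stats(runs, threshold):
    """Return (convergence rate, average convergence time over converged runs)."""
    times = [convergence_time(r, threshold) for r in runs]
    converged = [t for t in times if t is not None]
    rate = len(converged) / len(runs)
    avg_time = sum(converged) / len(converged) if converged else None
    return rate, avg_time
```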

Table 3: DNC Generalization Performance for OA and PA 

         OA              PA 
Agents   ICR     OCR     ICR     PCR 
15       0       0       0       0 
20       0       0       0       0 
25       0       0       0       0 
30       0       0       0       0 
35       0.011   0.009   0.013   0.010 
40       0.021   0.018   0.029   0.023 


5.5 Statistical Model Checking Results 


We use Monte Carlo (MC) approximation as a form of Statistical Model Checking 
to compute confidence intervals for the DNC’s convergence rate to a 
flock with collision avoidance and for the (normalized) convergence time. The 
convergence rate is the fraction of successful flocks over N runs. The collection 
of agents is said to have converged to a successful flock with collision avoidance 
if the global cost function satisfies $J_1(p) < 150$, where $J_1(p)$ is the cost function 
for basic flocking. 

The main idea of MC is to use N random variables, $Z_1, \ldots, Z_N$, also called 
samples, IID distributed according to a random variable Z with mean $\mu_Z$, and to 
take the average $\tilde{\mu}_Z = (Z_1 + \ldots + Z_N)/N$ as the value approximating the mean $\mu_Z$. 
Since an exact computation of $\mu_Z$ is almost always intractable, an MC approach 
is used to compute an $(\epsilon, \delta)$-approximation of this quantity. 

Additive Approximation [6] is an $(\epsilon, \delta)$-approximation scheme where the mean 
$\mu_Z$ of an RV Z is approximated with absolute error $\epsilon$ and probability $1 - \delta$: 

$$\Pr[\mu_Z - \epsilon \le \tilde{\mu}_Z \le \mu_Z + \epsilon] \ge 1 - \delta \tag{13}$$

where $\tilde{\mu}_Z$ is an approximation of $\mu_Z$. An important issue is to determine the 
number of samples N needed to ensure that $\tilde{\mu}_Z$ is an $(\epsilon, \delta)$-approximation of $\mu_Z$. If 
Z is a Bernoulli variable whose mean is expected to be large, one can use the Chernoff-Hoeffding 
instantiation of the Bernstein inequality and take N to be $N = 4\ln(2/\delta)/\epsilon^2$, 
as in [6]. This results in the additive approximation algorithm defined in 
Algorithm 1. 

We use this algorithm to obtain a joint $(\epsilon, \delta)$-approximation of the mean 
convergence rate and mean normalized convergence time for the DNC. Each 
sample $Z_i$ is based on the result of an execution obtained by simulating the 
system starting from a random initial state, and we take Z = (B, R), where B 
is a Boolean variable indicating whether the agents converged to a flock during 
the execution, and R is a real value denoting the normalized convergence time. 
The normalized convergence time is the time when the global cost function first 
drops below the convergence threshold and remains below it for the rest of the 
run, measured as a fraction of the total duration of the run. The assumptions 
about Z required for validity of the additive approximation hold, because RV B 
is a Bernoulli variable, the convergence rate is expected to be large (i.e., closer 
to 1 than to 0), and the proportionality constraint of the Bernstein inequality is 
also satisfied for RV R. 

Algorithm 1: Additive Approximation Algorithm 
Input: $(\epsilon, \delta)$ with $0 < \epsilon < 1$ and $0 < \delta < 1$ 
Input: Random variables $Z_i$, IID 
Output: $\tilde{\mu}_Z$, approximation of $\mu_Z$ 
$N = 4\ln(2/\delta)/\epsilon^2$; $S = 0$; 
for ($i = 0$; $i < N$; $i$++) do $S = S + Z_i$; 
$\tilde{\mu}_Z = S/N$; return $\tilde{\mu}_Z$; 

Table 4: SMC results for DNC convergence rate and normalized convergence 
time; $\epsilon = 0.01$, $\delta = 0.0001$ 

Agents   $\tilde{\mu}_{CR}$   $\tilde{\mu}_{CT}$ 
15       0.99                 0.53 
20       0.97                 0.58 
25       0.94                 0.65 
30       0.91                 0.71 
35       0.86                 0.84 
40       0.80                 0.95 

In these experiments, the initial configurations are sampled from the same 
distributions as in Section 5.1, and we set $\epsilon = 0.01$ and $\delta = 0.0001$, to obtain N = 
396,140. We perform the required set of N simulations for 15, 20, 25, 30, 35 and 
40 agents. Table 4 presents the results, specifically, the $(\epsilon, \delta)$-approximations $\tilde{\mu}_{CR}$ 
and $\tilde{\mu}_{CT}$ of the mean convergence rate and the mean normalized convergence 
time, respectively. While the results for the convergence rate are (as expected) 
numerically similar to the results in Table 2, the results in Table 4 are much stronger, 
because they come with the guarantee that they are $(\epsilon, \delta)$-approximations of the 
actual mean values. 
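The additive approximation scheme of Algorithm 1 can be sketched in Python as follows. The function names and the `sample` callback (which stands in for one simulation run returning a value of Z) are assumptions made for illustration; rounding N up to an integer is also an assumption, chosen because it reproduces the sample count reported above.

```python
import math


def sample_size(eps, delta):
    """Number of samples N = 4 ln(2/delta) / eps^2, rounded up."""
    return math.ceil(4 * math.log(2 / delta) / eps ** 2)


def additive_approximation(sample, eps, delta):
    """Estimate the mean of Z by averaging N IID samples drawn via sample()."""
    n = sample_size(eps, delta)
    total = 0.0
    for _ in range(n):
        total += sample()
    return total / n
```

With $\epsilon = 0.01$ and $\delta = 0.0001$, `sample_size` returns 396140, matching the value of N used in the experiments.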


6 Related Work 


In [18], a flocking controller is synthesized using multi-agent reinforcement learning 
(MARL) and natural evolution strategies (NES). The target model from which 
the system learns is Reynolds’ flocking model [16]. For training purposes, a list 
of metrics called entropy is chosen, which provides a measure of the collective 
behavior displayed by the target model. As the authors of [18] observe, this 
technique does not quite work: although it consistently leads to agents forming 
recognizable patterns during simulation, agents self-organized into a cluster 
instead of flowing like a flock. 

In pl, reinforcement learning and flocking control are combined for the 
purpose of predator avoidance, where the learning module determines safe spaces 
in which the flock can navigate to avoid predators. Their approach to predator 
avoidance, however, is not distributed, as it requires a majority consensus by the 
flock to determine its action to avoid predators. They also impose an α-lattice 
structure on the flock. In contrast, our approach is geometry-agnostic and 
achieves predator avoidance in a distributed manner. 

In [7], an uncertainty-aware reinforcement learning algorithm is developed 
to estimate the probability of a mobile robot colliding with an obstacle in an 
unknown environment. Their approach is based on bootstrap neural networks 
using dropouts, allowing it to process raw sensory inputs. Similarly, a learning- 
based approach to robot navigation and obstacle avoidance is presented in [14]. 
They train a model that maps sensor inputs and the target position to motion 
commands generated by the ROS navigation package. Our work, in contrast, 
considers obstacle avoidance (and other control objectives) in a multi-agent 
flocking scenario under the simplifying assumption of full state observation. 

In [4], an approach based on Bayesian inference is proposed that allows an 
agent in a heterogeneous multi-agent environment to estimate the navigation 
model and goal of each of its neighbors. It then uses this information to compute 
a plan that minimizes inter-agent collisions while allowing the agent to reach its 
goal. Flocking formation is not considered. 


7 Conclusions 


With the introduction of Neural Flocking (NF), we have shown how machine 
learning in the form of Supervised Learning can bring many benefits to the 
flocking problem. As our experimental evaluation confirms, the symmetric and 
fully distributed neural controllers we derive in this manner are capable of 
achieving a multitude of flocking-oriented objectives, including flocking formation, 
inter-agent collision avoidance, obstacle avoidance, predator avoidance, and target 
seeking. Moreover, NF controllers exhibit real-time performance and generalize 
the behavior seen in the training data to achieve these objectives in a significantly 
broader range of scenarios. 

Ongoing work aims to determine whether a DNC can perform as well as 
the centralized MPC controller for agent models that are significantly more 
realistic than our current point-based model. For this purpose, we are using 
transfer learning to train a DNC that can achieve acceptable performance on 
realistic quadrotor dynamics, starting from our current point-model-based 
DNC. This effort also involves extending our current DNC from 2-dimensional 
to 3-dimensional spatial coordinates. If successful, and preliminary results are 
encouraging, this line of research will demonstrate that DNCs are capable of 
achieving flocking with complex realistic dynamics. 

For future work, we plan to investigate a distance-based notion of agent neigh- 
borhood as opposed to our current nearest-neighbors formulation. Furthermore, 
motivated by the quadrotor study of [21], we will seek to combine MPC with 
reinforcement learning in the framework of guided policy search as an alternative 
solution technique for the NF problem. 

Abstract. This paper studies fundamental questions concerning category- 
theoretic models of induction and recursion. We are concerned with 
the relationship between well-founded and recursive coalgebras for an 
endofunctor. For monomorphism preserving endofunctors on complete 
and well-powered categories every coalgebra has a well-founded part, 
and we provide a new, shorter proof that this is the coreflection in 
the category of all well-founded coalgebras. We present a new more 
general proof of Taylor’s General Recursion Theorem that every well- 
founded coalgebra is recursive, and we study conditions which imply the 
converse. In addition, we present a new equivalent characterization of 
well-foundedness: a coalgebra is well-founded iff it admits a coalgebra-to- 
algebra morphism to the initial algebra. 


Keywords: Well-founded · Recursive · Coalgebra · Initial Algebra · 
General Recursion Theorem 


1 Introduction 


What is induction? What is recursion? In areas of theoretical computer science, 
the most common answers are related to initial algebras. Indeed, the dominant 
trend in abstract data types is initial algebra semantics (see e.g. [19]), and this 
approach has spread to other semantically-inclined areas of the subject. The 
approach in broad slogans is that, for an endofunctor F describing the type of 
algebraic operations of interest, the initial algebra $\mu F$ has the property that 
for every F-algebra A, there is a unique homomorphism $\mu F \to A$, and this is 
recursion. Perhaps the primary example is recursion on $\mathbb{N}$, the natural numbers. 
Recall that $\mathbb{N}$ is the initial algebra for the set functor $FX = X + 1$. If A is any 
set, and $a \in A$ and $\alpha\colon A \to A$ are given (together they form an algebra 
$A + 1 \to A$), then initiality tells us that there is 
a unique $f\colon \mathbb{N} \to A$ such that for all $n \in \mathbb{N}$, 

$$f(0) = a, \qquad f(n+1) = \alpha(f(n)). \tag{1.1}$$


* A full version of this paper including full proof details is available on arXiv [5]. 
** Supported by the Grant Agency of the Czech Republic under grant 19-00902S. 
*** Supported by Deutsche Forschungsgemeinschaft (DFG) under project MI 717/5-2. 
* Supported by grant #586136 from the Simons Foundation. 
J. Goubault-Larrecq and B. König (Eds.): FOSSACS 2020, LNCS 12077, pp. 17-36, 2020. 

Then the first additional problem coming with this approach is that of how to 
“recognize” initial algebras: Given an algebra, how do we really know if it is 
initial? The answer — again in slogans — is that initial algebras are the ones with 
“no junk and no confusion.” 

Although initiality captures some important aspects of recursion, it cannot be 
a fully satisfactory approach. One big missing piece concerns recursive definitions 
based on well-founded relations. For example, the whole study of termination 
of rewriting systems depends on well-orders, the primary example of recursion 
on a well-founded order. Let (X, R) be a well-founded relation, i.e. one with no 
infinite sequences $\cdots\, x_2 \mathrel{R} x_1 \mathrel{R} x_0$. Let A be any set, and let $\alpha\colon \mathcal{P}A \to A$. (Here 
and below, $\mathcal{P}$ is the power set functor, taking a set to the set of its subsets.) 
Then there is a unique $f\colon X \to A$ such that for all $x \in X$, 

$$f(x) = \alpha(\{f(y) : y \mathrel{R} x\}). \tag{1.2}$$
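For a finite well-founded relation, the recursion scheme (1.2) can be run directly. The sketch below is illustrative only: the relation is encoded as a dictionary mapping each x to the elements y with y R x, and the names `wf_recursion` and `rank` are hypothetical. The recursion terminates precisely because the relation has no infinite descending sequences.

```python
from functools import lru_cache


def wf_recursion(predecessors, alpha):
    """Return the unique f with f(x) = alpha({f(y) : y R x}).

    predecessors maps x to the collection of all y with y R x;
    alpha takes a frozenset of already computed values.
    """
    @lru_cache(maxsize=None)
    def f(x):
        return alpha(frozenset(f(y) for y in predecessors.get(x, ())))
    return f


def rank(values):
    """Example algebra on the natural numbers: least strict upper bound."""
    return max((v + 1 for v in values), default=0)
```

For instance, with `predecessors = {0: [], 1: [0], 2: [0, 1], 3: [1, 2]}`, the function `wf_recursion(predecessors, rank)` assigns to each element the length of the longest descending sequence starting from it.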


The main goal of this paper is the study of concepts that allow one to extend 
the algebraic spirit behind initiality in (1.1) to the setting of recursion arising 
from well-foundedness as we find it in (1.2). The corresponding concepts are 
those of well-founded and recursive coalgebras for an endofunctor, which first 
appear in work by Osius [22] and Taylor [23,24], respectively. In his work on 
categorical set theory, Osius [22] first studied the notions of well-founded and 
recursive coalgebras (for the power-set functor on sets and, more generally, the 
power-object functor on an elementary topos). He defined recursive coalgebras 
as those coalgebras $\alpha\colon A \to \mathcal{P}A$ which have a unique coalgebra-to-algebra 
homomorphism into every algebra (see Definition 3.2). 

Taylor [23,24] took Osius’ ideas much further. He introduced well-founded 
coalgebras for a general endofunctor, capturing the notion of a well-founded rela- 
tion categorically, and considered recursive coalgebras under the name ‘coalgebras 
obeying the recursion scheme’. He then proved the General Recursion Theorem 
that all well-founded coalgebras are recursive, for every endofunctor on sets (and 
on more general categories) preserving inverse images. Recursive coalgebras were 
also investigated by Eppendahl [12], who called them algebra-initial coalgebras. 
Capretta, Uustalu, and Vene [10] further studied recursive coalgebras, and they 
showed how to construct new ones from given ones by using comonads. They 
also explained nicely how recursive coalgebras allow for the semantic treatment 
of (functional) divide-and-conquer programs. More recently, Jeannin et al. [15] 
proved the General Recursion Theorem for polynomial functors on the category 
of many-sorted sets; they also provide many interesting examples of recursive 
coalgebras arising in programming. 

Our contributions in this paper are as follows. We start by recalling some pre- 
liminaries in Section 2 and the definition of (parametrically) recursive coalgebras 
in Section 3 and of well-founded coalgebras in Section 4 (using a formulation 
based on Jacobs’ next time operator [14], which we extend from Kripke poly- 
nomial set functors to arbitrary functors). We show that every coalgebra for a 
monomorphism preserving functor on a complete and well-powered category has 
a well-founded part, and provide a new proof that this is the coreflection in the 
category of well-founded coalgebras (Proposition 4.19), shortening our previous 
proof [6]. Next we provide a new proof of Taylor’s General Recursion Theorem 
(Theorem 5.1), generalizing this to endofunctors preserving monomorphisms on a 
complete and well-powered category having smooth monomorphisms (see 
Definition 2.8). For the category of sets, this implies that “well-founded $\Rightarrow$ recursive” 
holds for all endofunctors, strengthening Taylor’s result. We then discuss the 
converse: is every recursive coalgebra well-founded? Here the assumption that F 
preserves inverse images cannot be lifted, and one needs additional assumptions. 
In fact, we present two results: one assumes universally smooth monomorph- 
isms and that the functor has a pre-fixed point (see Theorem 5.5). Under these 
assumptions we also give a new equivalent characterization of recursiveness 
and well-foundedness: a coalgebra is recursive iff it has a coalgebra-to-algebra 
morphism into the initial algebra (which exists under our assumptions), see Co- 
rollary 5.6. This characterization was previously established for finitary functors 
on sets [3]. The other converse of the above implication is due to Taylor using 
the concept of a subobject classifier (Theorem 5.8). It implies that ‘recursive’ 
and ‘well-founded’ are equivalent concepts for all set functors preserving inverse 
images. We also prove that a similar result holds for the category of vector spaces 
over a fixed field (Theorem 5.12). 

Finally, we show in Section 6 that well-founded coalgebras are closed under 
coproducts, quotients and, assuming mild assumptions, under subcoalgebras. 


2 Preliminaries 


We start by recalling some background material. Except for the definitions of 
algebra and coalgebra in Subsection 2.1, the subsections below may be read as 
needed. We assume that readers are familiar with notions of basic category theory; 
see e.g. [2] for everything which we do not detail. We indicate monomorphisms 
by writing $\rightarrowtail$ and strong epimorphisms by $\twoheadrightarrow$. 


2.1 Algebras and Coalgebras. We are concerned throughout this paper 
with algebras and coalgebras for an endofunctor. This means that we have an 
underlying category, usually written $\mathscr{A}$; frequently it is the category of sets or 
of vector spaces over a fixed field, and that a functor $F\colon \mathscr{A} \to \mathscr{A}$ is given. An 
F-algebra is a pair $(A, \alpha)$, where $\alpha\colon FA \to A$. An F-coalgebra is a pair $(A, \alpha)$, 
where $\alpha\colon A \to FA$. We usually drop the functor F. Given two algebras $(A, \alpha)$ 
and $(B, \beta)$, an algebra homomorphism from the first to the second is $h\colon A \to B$ 
in $\mathscr{A}$ such that $h \cdot \alpha = \beta \cdot Fh$. Similarly, a coalgebra homomorphism satisfies 
$\beta \cdot h = Fh \cdot \alpha$. We denote by Coalg F the category of all coalgebras for F. 


Example 2.1. (1) The power set functor $\mathcal{P}\colon \mathsf{Set} \to \mathsf{Set}$ takes a set X to the set 
$\mathcal{P}X$ of all subsets of it; for a morphism $f\colon X \to Y$, $\mathcal{P}f\colon \mathcal{P}X \to \mathcal{P}Y$ takes a 
subset $S \subseteq X$ to its direct image $f[S]$. Coalgebras $\alpha\colon X \to \mathcal{P}X$ may be identified 
with directed graphs on the set X of vertices, and the coalgebra structure $\alpha$ 
describes the edges: $b \in \alpha(a)$ means that there is an edge $a \to b$ in the graph. 

(2) Let $\Sigma$ be a signature, i.e. a set of operation symbols, each with a finite arity. 
The polynomial functor $H_\Sigma$ associated to $\Sigma$ assigns to a set X the set 

$$H_\Sigma X = \coprod_{n \in \mathbb{N}} \Sigma_n \times X^n,$$

where $\Sigma_n$ is the set of operation symbols of arity n. This may be identified with 
the set of all terms $\sigma(x_1, \ldots, x_n)$, for $\sigma \in \Sigma_n$, and $x_1, \ldots, x_n \in X$. Algebras for 
$H_\Sigma$ are the usual $\Sigma$-algebras. 

(3) Deterministic automata over an input alphabet $\Sigma$ are coalgebras for the 
functor $FX = \{0,1\} \times X^\Sigma$. Indeed, given a set S of states, a next-state map 
$S \times \Sigma \to S$ may be curried to $\delta\colon S \to S^\Sigma$. The set of final states yields the 
acceptance predicate $a\colon S \to \{0,1\}$. So an automaton may be regarded as a 
coalgebra $\langle a, \delta \rangle\colon S \to \{0,1\} \times S^\Sigma$. 

(4) Labelled transition systems are coalgebras for $FX = \mathcal{P}(\Sigma \times X)$. 

(5) To describe linear weighted automata, i.e. weighted automata over the input 
alphabet $\Sigma$ with weights in a field K, as coalgebras, one works with the category 
$\mathsf{Vec}_K$ of vector spaces over K. A linear weighted automaton is then a 
coalgebra for $FX = K \times X^\Sigma$. 
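To make item (3) concrete, the following sketch encodes a deterministic automaton as a single function from states to pairs (acceptance bit, transition table), i.e. as a coalgebra for $FX = \{0,1\} \times X^\Sigma$. The automaton `EVEN_A` and the function name `accepts` are hypothetical examples, not taken from the paper.

```python
# A coalgebra S -> {0,1} x S^Sigma, stored as a dictionary:
# state -> (acceptance bit, {letter: next state}).
# This automaton accepts words over {a, b} with an even number of a's.
EVEN_A = {
    "even": (1, {"a": "odd", "b": "even"}),
    "odd": (0, {"a": "even", "b": "odd"}),
}


def accepts(coalgebra, state, word):
    """Run the automaton by repeatedly applying the coalgebra structure."""
    for letter in word:
        _, delta = coalgebra[state]   # second component: curried next-state map
        state = delta[letter]
    bit, _ = coalgebra[state]         # first component: acceptance predicate
    return bit == 1
```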


2.2 Preservation Properties. Recall that an intersection of two subobjects 
$s_i\colon S_i \rightarrowtail A$ (i = 1, 2) of a given object A is given by their pullback. Analogously, 
(general) intersections are given by wide pullbacks. Furthermore, the inverse 
image of a subobject $s\colon S \rightarrowtail B$ under a morphism $f\colon A \to B$ is the subobject 
$t\colon T \rightarrowtail A$ obtained by a pullback of s along f. 

All of the ‘usual’ set functors preserve intersections and inverse images: 


Example 2.2. (1) Every polynomial functor preserves intersections and inverse 
images. 

(2) The power-set functor $\mathcal{P}$ preserves intersections and inverse images. 

(3) Intersection-preserving set functors are closed under taking coproducts, 
products and composition. Similarly, for inverse images. 

(4) Consider next the set functor R defined by $RX = \{(x, y) \in X \times X : x \neq y\} + \{d\}$ 
for sets X. For a function $f\colon X \to Y$ put $Rf(x, y) = (f(x), f(y))$ if 
$f(x) \neq f(y)$, and d otherwise. R preserves intersections but not inverse images. 
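The failure of inverse-image preservation for the functor R of item (4) can be checked on a small instance. The sketch below is an illustration, not the paper's argument: it takes the constant map from a two-element set to a one-element set and the empty subset of the codomain, which is one possible witness. The helper names are hypothetical.

```python
D = "d"  # the extra element added in RX


def R_obj(xs):
    """RX = {(x, y) in X x X : x != y} + {d}."""
    xs = set(xs)
    return {(x, y) for x in xs for y in xs if x != y} | {D}


def R_map(f):
    """Rf sends (x, y) to (f(x), f(y)) if these differ, and to d otherwise."""
    def Rf(t):
        if t == D:
            return D
        x, y = t
        return (f(x), f(y)) if f(x) != f(y) else D
    return Rf


def preimage(g, domain, subset):
    """Inverse image of subset under g, computed within domain."""
    return {t for t in domain if g(t) in subset}


X, S = {0, 1}, set()          # S is the empty subset of the codomain {"*"}
f = lambda x: "*"             # the constant map X -> {"*"}

inverse_image_of_RS = preimage(R_map(f), R_obj(X), R_obj(S))
R_of_inverse_image = R_obj(preimage(f, X, S))
```

Here `inverse_image_of_RS` is all of RX (three elements), whereas `R_of_inverse_image` is just `{"d"}`, so R does not send this inverse image to an inverse image.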


Proposition 2.3 [27]. For every set functor F there exists an essentially unique 
set functor $\bar{F}$ which coincides with F on nonempty sets and functions and 
preserves finite intersections (whence monomorphisms). 


Remark 2.4. (1) In fact, Trnková gave a construction of $\bar{F}$: she defined $\bar{F}\emptyset$ as 
the set of all natural transformations $C_{01} \to F$, where $C_{01}$ is the set functor with 
$C_{01}\emptyset = \emptyset$ and $C_{01}X = 1$ for all nonempty sets X. For the empty map $e\colon \emptyset \to X$ 
with $X \neq \emptyset$, $\bar{F}e$ maps a natural transformation $\tau\colon C_{01} \to F$ to the element given 
by $\tau_X\colon 1 \to FX$. 

(2) The above functor $\bar{F}$ is called the Trnková hull of F. It allows us to achieve 
preservation of intersections for all finitary set functors. Intuitively, a functor on 
sets is finitary if its behavior is completely determined by its action on finite sets 
and functions. For a general functor, this intuition is captured by requiring that 
the functor preserves filtered colimits [8]. For a set functor F this is equivalent to 
being finitely bounded, which is the following condition: for each element $x \in FX$ 
there exists a finite subset $M \subseteq X$ such that $x \in Fi[FM]$, where $i\colon M \hookrightarrow X$ is 
the inclusion map [7, Rem. 3.14]. 


Proposition 2.5 [4, p. 66]. The Trnková hull of a finitary set functor preserves 
all intersections. 


2.3 Factorizations. Recall that an epimorphism $e\colon A \to B$ is called strong 
if it satisfies the following diagonal fill-in property: given a monomorphism 
$m\colon C \rightarrowtail D$ and morphisms $f\colon A \to C$ and $g\colon B \to D$ such that $m \cdot f = g \cdot e$, 
there exists a unique $d\colon B \to C$ such that $f = d \cdot e$ and $g = m \cdot d$. 

Every complete and well-powered category has factorizations of morphisms: 
every morphism f may be written as $f = m \cdot e$, where e is a strong epimorphism 
and m is a monomorphism [9, Prop. 4.4.3]. We call the subobject m the image 
of f. It follows from a result in Kurz’ thesis [16, Prop. 1.3.6] that factorizations 
of morphisms lift to coalgebras: 


Proposition 2.6 (Coalg F inherits factorizations from $\mathscr{A}$). Suppose that 
F preserves monomorphisms. Then the category Coalg F has factorizations of 
homomorphisms f as $f = m \cdot e$, where e is carried by a strong epimorphism and 
m by a monomorphism in $\mathscr{A}$. The diagonal fill-in property holds in Coalg F. 


Remark 2.7. By a subcoalgebra of a coalgebra $(A, \alpha)$ we mean a subobject 
in Coalg F represented by a homomorphism $m\colon (B, \beta) \rightarrowtail (A, \alpha)$, where m is 
monic in $\mathscr{A}$. Similarly, by a strong quotient of a coalgebra $(A, \alpha)$ we mean one 
represented by a homomorphism $e\colon (A, \alpha) \twoheadrightarrow (C, \gamma)$ with e strongly epic in $\mathscr{A}$. 


2.4 Chains. By a transfinite chain in a category $\mathscr{A}$ we understand a functor 
from the ordered class Ord of all ordinals into $\mathscr{A}$. Moreover, for an ordinal $\lambda$, a 
$\lambda$-chain in $\mathscr{A}$ is a functor from $\lambda$ to $\mathscr{A}$. A category has colimits of chains if for 
every ordinal $\lambda$ it has a colimit of every $\lambda$-chain. This includes the initial object 
0 (the case $\lambda = 0$). 


Definition 2.8. (1) A category $\mathscr{A}$ has smooth monomorphisms if for every 
$\lambda$-chain C of monomorphisms a colimit exists, its colimit cocone is formed 
by monomorphisms, and for every cocone of C formed by monomorphisms, the 
factorizing morphism from colim C is monic. In particular, every morphism from 
0 is monic. 

(2) $\mathscr{A}$ has universally smooth monomorphisms if $\mathscr{A}$ also has pullbacks, and 
for every morphism $f\colon X \to \operatorname{colim} C$, the functor $\mathscr{A}/\operatorname{colim} C \to \mathscr{A}/X$ forming 
pullbacks along f preserves the colimit of C. This implies that the initial object 0 
is strict, i.e. every morphism $f\colon X \to 0$ is an isomorphism. Indeed, consider the 
empty chain ($\lambda = 0$). 


Example 2.9. (1) Set has universally smooth monomorphisms. 

(2) $\mathsf{Vec}_K$ has smooth monomorphisms, but not universally so because the initial 
object is not strict. 

(3) Categories in which colimits of chains and pullbacks are formed “set-like” 
have universally smooth monomorphisms. These include the categories of posets, 
graphs, topological spaces, presheaf categories, and many varieties, such as 
monoids, groups, and unary algebras. 

(4) Every locally finitely presentable category $\mathscr{A}$ with a strict initial object (see 
Remark 2.12(1)) has smooth monomorphisms. This follows from [8, Prop. 1.62]. 
Moreover, since pullbacks commute with colimits of chains, it is easy to prove 
that colimits of chains are universal using the strictness of 0. 

(5) The category CPO of complete partial orders does not have smooth 
monomorphisms. Indeed, consider the $\omega$-chain of linearly ordered sets $A_n = \{0, \ldots, n\} + 
\{\top\}$ ($\top$ a top element) with inclusion maps $A_n \to A_{n+1}$. Its colimit is the linearly 
ordered set $\mathbb{N} + \{\top, \top'\}$ of natural numbers with two added top elements $\top' < \top$. 
For the sub-cpo $\mathbb{N} + \{\top\}$, the inclusions of $A_n$ are monic and form a cocone. But 
the unique factorizing morphism from the colimit is not monic. 


Notation 2.10. For every object A we denote by Sub(A) the poset of all 
subobjects of A (represented by monomorphisms $s\colon S \rightarrowtail A$), where $s \le s'$ if there exists 
i with $s = s' \cdot i$. If $\mathscr{A}$ has pullbacks we have, for every morphism $f\colon A \to B$, the 
inverse image operator, viz. the monotone map $\overleftarrow{f}\colon \mathrm{Sub}(B) \to \mathrm{Sub}(A)$ assigning 
to a subobject $s\colon S \rightarrowtail B$ the subobject of A obtained by forming the inverse 
image of s under f, i.e. the pullback of s along f. 


Lemma 2.11. If $\mathscr{A}$ is complete and well-powered, then $\overleftarrow{f}$ has a left adjoint 
given by the (direct) image operator $\overrightarrow{f}\colon \mathrm{Sub}(A) \to \mathrm{Sub}(B)$. It maps a subobject 
$t\colon T \rightarrowtail A$ to the subobject of B given by the image of $f \cdot t$; in symbols we have 

$$\overrightarrow{f}(t) \le s \quad\text{iff}\quad t \le \overleftarrow{f}(s).$$
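In the category of sets, the adjunction of Lemma 2.11 is the familiar fact that $f[T] \subseteq S$ iff $T \subseteq f^{-1}[S]$. The following sketch checks this exhaustively for a finite function; the encoding of f as a dictionary and all function names are assumptions made for illustration.

```python
from itertools import chain, combinations


def subsets(xs):
    """All subsets of a finite collection, as frozensets."""
    xs = list(xs)
    return [frozenset(c) for c in
            chain.from_iterable(combinations(xs, r) for r in range(len(xs) + 1))]


def direct_image(f, t):
    """Image of the subset t of the domain under f (f is a dict)."""
    return frozenset(f[a] for a in t)


def inverse_image(f, s):
    """Inverse image of the subset s of the codomain under f."""
    return frozenset(a for a in f if f[a] in s)


def is_adjoint(f, codomain):
    """Check direct_image(t) <= s  iff  t <= inverse_image(s) for all t, s."""
    return all((direct_image(f, t) <= s) == (t <= inverse_image(f, s))
               for t in subsets(f) for s in subsets(codomain))
```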


Remark 2.12. If $\mathscr{A}$ is a complete and well-powered category, then Sub(A) is a 
complete lattice. Now suppose that $\mathscr{A}$ has smooth monomorphisms. 


(1) In this setting, the unique morphism $\bot_A\colon 0 \to A$ is a monomorphism and 
therefore is the bottom element of the poset Sub(A). 

(2) Furthermore, a join of a chain in Sub(A) is obtained by forming a colimit, in 
the obvious way. 

(3) If $\mathscr{A}$ has universally smooth monomorphisms, then for every morphism 
$f\colon A \to B$, the operator $\overleftarrow{f}\colon \mathrm{Sub}(B) \to \mathrm{Sub}(A)$ preserves unions of chains. 


Remark 2.13. Recall [1] that every endofunctor F yields the initial-algebra 
chain, viz. a transfinite chain formed by the objects $F^i 0$ of $\mathscr{A}$, as follows: $F^0 0 = 0$, 
the initial object; $F^{i+1} 0 = F(F^i 0)$, and for a limit ordinal i we take the colimit 
of the chain $(F^j 0)_{j<i}$. The connecting morphisms $w_{i,j}\colon F^i 0 \to F^j 0$ are defined 
by a similar transfinite recursion. 
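The finite stages of the initial-algebra chain are easy to compute for a concrete set functor. The sketch below uses $FX = 1 + X \times X$ (one constant and one binary operation), chosen here purely as an example; elements are represented as nested tuples, so that the connecting maps $w_{i,i+1}$ become subset inclusions. The names `F`, `LEAF` and `initial_chain` are hypothetical.

```python
LEAF = "leaf"  # the single element of the summand 1


def F(xs):
    """FX = 1 + X x X on finite sets, with elements encoded as terms."""
    return frozenset({LEAF}) | frozenset((l, r) for l in xs for r in xs)


def initial_chain(steps):
    """Return [F^0 0, F^1 0, ..., F^steps 0], starting from the empty set."""
    stages = [frozenset()]
    for _ in range(steps):
        stages.append(F(stages[-1]))
    return stages
```

The stage $F^i 0$ consists of all binary trees of height less than i, so the first stages have 0, 1, 2, 5 and 26 elements.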

3 Recursive Coalgebras 


Assumption 3.1. We work with a standard set theory (e.g. Zermelo-Fraenkel), 
assuming the Axiom of Choice. In particular, we use transfinite induction on 
several occasions. (We are not concerned with constructive foundations in this 
paper.) 

Throughout this paper we assume that $\mathscr{A}$ is a complete and well-powered 
category and that $F\colon \mathscr{A} \to \mathscr{A}$ preserves monomorphisms. 


For $\mathscr{A} = \mathsf{Set}$ the condition that F preserves monomorphisms may be dropped. 
In fact, preservation of non-empty monomorphisms is sufficient in general (for a 
suitable notion of non-empty monomorphism) [21, Lemma 2.5], and this holds 
for every set functor. 

The following definition of recursive coalgebras was first given by Osius [22]. 
Taylor [24] speaks of coalgebras obeying the recursion scheme. Capretta et al. [10] 
extended the concept to parametrically recursive coalgebras by dualizing completely 
iterative algebras [20]. 


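Definition 3.2. A coalgebra $\alpha\colon A \to FA$ is called recursive if for every algebra 
$e\colon FX \to X$ there exists a unique coalgebra-to-algebra morphism $e^\dagger\colon A \to X$, 
i.e. a unique morphism such that the square on the left below commutes: 

$$
\begin{array}{ccc}
A & \xrightarrow{\;e^\dagger\;} & X \\
{\scriptstyle \alpha}\downarrow & & \uparrow{\scriptstyle e} \\
FA & \xrightarrow{\;Fe^\dagger\;} & FX
\end{array}
\qquad\qquad
\begin{array}{ccc}
A & \xrightarrow{\;e^\dagger\;} & X \\
{\scriptstyle \langle \alpha, A \rangle}\downarrow & & \uparrow{\scriptstyle e} \\
FA \times A & \xrightarrow{\;Fe^\dagger \times A\;} & FX \times A
\end{array}
$$

$(A, \alpha)$ is called parametrically recursive if for every morphism $e\colon FX \times A \to X$ 
there is a unique morphism $e^\dagger\colon A \to X$ such that the square on the right above 
commutes. 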


Example 3.3. (1) A graph regarded as a coalgebra for $\mathcal{P}$ is recursive iff it has 
no infinite path. This is an immediate consequence of the General Recursion 
Theorem (see Corollary 5.6 and Example 4.5(2)). 

(2) Let $\iota\colon F(\mu F) \to \mu F$ be an initial algebra. By Lambek’s Lemma, $\iota$ is an 
isomorphism. So we have a coalgebra $\iota^{-1}\colon \mu F \to F(\mu F)$. This coalgebra is 
(parametrically) recursive. By [20, Thm. 2.8], in dual form, this is precisely the same 
as the terminal parametrically recursive coalgebra (see also [10, Prop. 7]). 

(3) The initial coalgebra $0 \to F0$ is recursive. 

(4) If $(C, \gamma)$ is recursive so is $(FC, F\gamma)$, see [10, Prop. 6]. 

(5) Colimits of recursive coalgebras in Coalg F are recursive. This is easy to 
prove, using that colimits of coalgebras are formed on the level of the underlying 
category. 

(6) It follows from items (3)-(5) that in the initial-algebra chain from 
Remark 2.13 all coalgebras $w_{i,i+1}\colon F^i 0 \to F^{i+1} 0$, $i \in \mathrm{Ord}$, are recursive. 

(7) Every parametrically recursive coalgebra is recursive. (To see this, form for 
a given $e\colon FX \to X$ the morphism $e' = e \cdot \pi$, where $\pi\colon FX \times A \to FX$ is the 
projection.) In Corollaries 5.6 and 5.9 we will see that the converse often holds. 

Here is an example where the converse fails [3]. Let $R\colon \mathsf{Set} \to \mathsf{Set}$ be the 
functor defined in Example 2.2(4). Also, let $C = \{0, 1\}$, and define $\gamma\colon C \to RC$ 
by $\gamma(0) = \gamma(1) = (0, 1)$. Then $(C, \gamma)$ is a recursive coalgebra. Indeed, for every 
algebra $\alpha\colon RA \to A$ the constant map $h\colon C \to A$ with $h(0) = h(1) = \alpha(d)$ is the 
unique coalgebra-to-algebra morphism. 

However, $(C, \gamma)$ is not parametrically recursive. To see this, consider any 
morphism $e\colon RX \times \{0, 1\} \to X$ such that RX contains more than one pair 
$(x_0, x_1)$, $x_0 \neq x_1$, with $e((x_0, x_1), i) = x_i$ for i = 0, 1. Then each such pair yields 
$h\colon C \to X$ with $h(i) = x_i$ making the appropriate square commutative. Thus, 
$(C, \gamma)$ is not parametrically recursive. 
(8) Capretta et al. [11] showed that recursivity semantically models
divide-and-conquer programs, as demonstrated by the example of Quicksort. For every
linearly ordered set $A$ (of data elements), Quicksort is usually defined as the
recursive function $q\colon A^* \to A^*$ given by

$$q(\varepsilon) = \varepsilon \quad\text{and}\quad q(aw) = q(w_{\le a}) \star (a\, q(w_{>a})),$$

where $A^*$ is the set of all lists on $A$, $\varepsilon$ is the empty list, $\star$ is the concatenation of
lists, and $w_{\le a}$ denotes the list of those elements of $w$ that are less than or equal
to $a$; analogously for $w_{>a}$.

Now consider the functor $FX = 1 + A \times X \times X$ on $\mathsf{Set}$, where $1 = \{\bullet\}$, and
form the coalgebra $s\colon A^* \to 1 + A \times A^* \times A^*$ given by

$$s(\varepsilon) = \bullet \quad\text{and}\quad s(aw) = (a, w_{\le a}, w_{>a}) \quad\text{for } a \in A \text{ and } w \in A^*.$$

We shall see that this coalgebra is recursive in Example 5.3. Thus, for the
$F$-algebra $m\colon 1 + A \times A^* \times A^* \to A^*$ given by

$$m(\bullet) = \varepsilon \quad\text{and}\quad m(a, w, v) = w \star (av)$$

there exists a unique function $q$ on $A^*$ such that $q = m \cdot Fq \cdot s$. Notice that the
last equation reflects the idea that Quicksort is a divide-and-conquer algorithm.
The coalgebra structure $s$ divides a list into two parts $w_{\le a}$ and $w_{>a}$. Then $Fq$
sorts these two smaller lists, and finally in the combine (or conquer) step, the
algebra structure $m$ merges the two sorted parts to obtain the desired whole
sorted list.
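
The equation $q = m \cdot Fq \cdot s$ can be run directly. The following Python sketch is only an illustration under stated assumptions: the encoding of $FX = 1 + A \times X \times X$ (with `None` for the element of $1$ and a triple for the other summand) and the names `split`, `merge`, `fmap` and `coalg_to_alg` are hypothetical choices made here, not notation from the paper.

```python
def split(lst):
    """Coalgebra s: send the empty list to the element of 1 (None here),
    and a list aw to (a, elements of w that are <= a, elements of w that are > a)."""
    if not lst:
        return None
    a, w = lst[0], lst[1:]
    return (a, [x for x in w if x <= a], [x for x in w if x > a])


def merge(t):
    """Algebra m: send the element of 1 to the empty list and (a, w, v) to w * (a v)."""
    if t is None:
        return []
    a, w, v = t
    return w + [a] + v


def fmap(f, t):
    """Action of F on a map f: apply f to both X-components, keep the A-component."""
    if t is None:
        return None
    a, x, y = t
    return (a, f(x), f(y))


def coalg_to_alg(alg, coalg):
    """Return the function q satisfying q = alg . F q . coalg.
    The recursion terminates here because every call is on a strictly shorter list."""
    def q(x):
        return alg(fmap(q, coalg(x)))
    return q


quicksort = coalg_to_alg(merge, split)
print(quicksort([3, 1, 2, 3, 0]))
```

Separating `split` (divide) from `merge` (combine) mirrors the roles of $s$ and $m$ above; the same `coalg_to_alg` helper works for any algebra paired with this coalgebra.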


Jeannin et al. [15, Sec. 4] provide a number of recursive functions arising in 
programming that are determined by recursivity of a coalgebra, e.g. the gcd of 
integers, the Ackermann function, and the Towers of Hanoi. 
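
To make the gcd case concrete, here is one possible reading of Euclid's algorithm as a coalgebra-to-algebra morphism. This is a sketch under assumptions and is not taken from [15]: the functor $FX = \mathbb{N} + X$, the tags `"done"` and `"next"`, and all function names are hypothetical choices made for illustration.

```python
def step(pair):
    """Coalgebra on pairs of natural numbers for the (assumed) functor FX = N + X:
    stop with the result a when b == 0, otherwise continue with (b, a mod b)."""
    a, b = pair
    return ("done", a) if b == 0 else ("next", (b, a % b))


def collapse(t):
    """Algebra N + N -> N: forget the tag and keep the value."""
    _tag, value = t
    return value


def fmap(f, t):
    """Action of FX = N + X on a map f: apply f only to the 'next' summand."""
    tag, value = t
    return (tag, f(value)) if tag == "next" else (tag, value)


def coalg_to_alg(alg, coalg):
    """Return g with g = alg . F g . coalg; terminates here because the
    second component of the pair strictly decreases in every step."""
    def g(x):
        return alg(fmap(g, coalg(x)))
    return g


gcd = coalg_to_alg(collapse, step)
print(gcd((12, 18)))
```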


4 The Next Time Operator and Well-Founded Coalgebras 


As we have mentioned in the Introduction, the main issue of this paper is the 
relationship between two concepts pertaining to coalgebras: recursiveness and 
well-foundedness. The concept of well-foundedness is well-known for directed
graphs $(G, \to)$: it means that there are no infinite directed paths $g_0 \to g_1 \to \cdots$.
For a set $X$ with a relation $R$, well-foundedness means that there are no backwards
sequences $\cdots\, R\, x_2\, R\, x_1\, R\, x_0$, i.e. the converse of the relation is well-founded as a
graph. Taylor [24, Def. 6.2.3] gave a more general category theoretic formulation
of well-foundedness. We observe here that his definition can be presented in a
compact way, by using an operator that generalizes the way one thinks of the
semantics of the ‘next time’ operator of temporal logics for non-deterministic (or
even probabilistic) automata and transition systems. It is also strongly related
to the algebraic semantics of modal logic, where one passes from a graph $G$
to a function on $\mathcal{P}G$. Jacobs [14] defined and studied the ‘next time’ operator
on coalgebras for Kripke polynomial set functors. This can be generalized to
arbitrary functors as follows.

Recall that $\mathrm{Sub}(A)$ denotes the complete lattice of subobjects of $A$.


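Definition 4.1 [4, Def. 8.9]. Every coalgebra $\alpha\colon A \to FA$ induces an
endofunction on $\mathrm{Sub}(A)$, called the next time operator

$$\bigcirc\colon \mathrm{Sub}(A) \to \mathrm{Sub}(A), \qquad \bigcirc(s) = \overleftarrow{\alpha}(Fs) \quad\text{for } s \in \mathrm{Sub}(A).$$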


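In more detail: we define $\bigcirc s$ and $\alpha(s)$ by the pullback in (4.1). (Being a pullback
is indicated by the “corner” symbol.) In words, $\bigcirc$ assigns to each subobject
$s\colon S \rightarrowtail A$ the inverse image of $Fs$ under $\alpha$. Since $Fs$ is a monomorphism, $\bigcirc s$ is a
monomorphism and $\alpha(s)$ is (for every representation $\bigcirc s$ of that subobject of $A$)
uniquely determined.

$$\begin{array}{ccc}
\bigcirc S & \xrightarrow{\ \alpha(s)\ } & FS \\
{\scriptstyle \bigcirc s}\big\downarrow & & \big\downarrow{\scriptstyle Fs} \\
A & \xrightarrow{\quad \alpha \quad} & FA
\end{array} \tag{4.1}$$
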
Example 4.2. (1) Let $A$ be a graph, considered as a coalgebra for $\mathcal{P}\colon \mathsf{Set} \to \mathsf{Set}$.
If $S \subseteq A$ is a set of vertices, then $\bigcirc S$ is the set of vertices all of whose successors
belong to $S$.

(2) For the set functor $FX = \mathcal{P}(\Sigma \times X)$ expressing labelled transition systems
the operator $\bigcirc$ for a coalgebra $\alpha\colon A \to \mathcal{P}(\Sigma \times A)$ is the semantic counterpart
of the next time operator of classical linear temporal logic, see e.g. Manna and
Pnueli [18]. In fact, for a subset $S \hookrightarrow A$ we have that $\bigcirc S$ consists of those states
all of whose next states lie in $S$, in symbols:

$$\bigcirc S = \{x \in A \mid (s, y) \in \alpha(x) \text{ implies } y \in S, \text{ for all } s \in \Sigma\}.$$

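
For a finite graph, the operator of Example 4.2(1) is a one-line set comprehension. The sketch below assumes a graph is given as a Python dictionary mapping each vertex to its set of successors; that encoding, the function name `next_time` and the sample graph are hypothetical choices made for illustration.

```python
def next_time(graph, subset):
    """Return the set of vertices all of whose successors lie in `subset`.
    `graph` maps every vertex to the set of its successors."""
    return {x for x, successors in graph.items() if successors <= subset}


# A small sample graph: 2 is a leaf, 3 has a loop.
graph = {0: {1, 2}, 1: {2}, 2: set(), 3: {3}}

print(next_time(graph, set()))   # vertices without successors
print(next_time(graph, {2}))     # vertices whose successors are all leaves
```

Note that a vertex without successors belongs to the result for every subset, since the condition on its successors holds vacuously.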

The next time operator allows a compact definition of well-foundedness as
characterized by Taylor [24, Exercise VI.17] (see also [6, Corollary 2.19]):

Definition 4.3. A coalgebra is well-founded if $\mathrm{id}_A$ is the only fixed point of its
next time operator.

Remark 4.4. (1) Let us call a subcoalgebra $m\colon (B, \beta) \rightarrowtail (A, \alpha)$ cartesian
provided that the square (4.2) is a pullback. Then $(A, \alpha)$ is well-founded iff it
has no proper cartesian subcoalgebra. That is, if $m\colon (B, \beta) \rightarrowtail (A, \alpha)$ is a
cartesian subcoalgebra, then $m$ is an isomorphism. Indeed, the fixed points of
next time are precisely the cartesian subcoalgebras.

$$\begin{array}{ccc}
B & \xrightarrow{\ \beta\ } & FB \\
{\scriptstyle m}\big\downarrow & & \big\downarrow{\scriptstyle Fm} \\
A & \xrightarrow{\ \alpha\ } & FA
\end{array} \tag{4.2}$$

(2) A coalgebra is well-founded iff $\bigcirc$ has a unique pre-fixed point $\bigcirc m \le m$.
Indeed, since $\mathrm{Sub}(A)$ is a complete lattice, the least fixed point of a monotone
map is its least pre-fixed point. Taylor’s definition [24, Def. 6.3.2] uses that
property: he calls a coalgebra well-founded iff $\bigcirc$ has no proper subobject as a
pre-fixed point.


Example 4.5. (1) Consider a graph as a coalgebra $\alpha\colon A \to \mathcal{P}A$ for the
power-set functor (see Example 2.1). A subcoalgebra is a subset $m\colon B \hookrightarrow A$ such
that with every vertex $v$ it contains all neighbors of $v$. The coalgebra structure
$\beta\colon B \to \mathcal{P}B$ is then the domain-codomain restriction of $\alpha$. To say that $B$ is a
cartesian subcoalgebra means that whenever a vertex of $A$ has all neighbors in
$B$, it also lies in $B$. It follows that $(A, \alpha)$ is well-founded iff it has no infinite
directed path, see [24, Example 6.3.3].

(2) If $\mu F$ exists, then as a coalgebra it is well-founded. Indeed, in every
pullback (4.2), since $\iota^{-1}$ (as $\alpha$) is invertible, so is $\beta$. The unique algebra
homomorphism from $\mu F$ to the algebra $\beta^{-1}\colon FB \to B$ is clearly inverse to $m$.

(3) If a set functor $F$ fulfils $F\emptyset = \emptyset$, then the only well-founded coalgebra is the
empty one. Indeed, this follows from the fact that the empty coalgebra is a fixed
point of $\bigcirc$. For example, a deterministic automaton over the input alphabet $\Sigma$,
as a coalgebra for $FX = \{0,1\} \times X^\Sigma$, is well-founded iff it is empty.

(4) A non-deterministic automaton may be considered as a coalgebra for the set
functor $FX = \{0,1\} \times (\mathcal{P}X)^\Sigma$. It is well-founded iff the state transition graph
is well-founded (i.e. has no infinite path). This follows from Corollary 4.10 below.

(5) A linear weighted automaton, i.e. a coalgebra for $FX = K \times X^\Sigma$ on $\mathsf{Vec}_K$,
is well-founded iff every path in its state transition graph eventually leads to 0.
This means that every path starting in a given state leads to the state 0 after
finitely many steps (where it stays).


Notation 4.6. Given a set functor $F$, we define for every set $X$ the map
$\tau_X\colon FX \to \mathcal{P}X$ assigning to every element $x \in FX$ the intersection of all
subsets $m\colon M \hookrightarrow X$ such that $x$ lies in the image of $Fm$:

$$\tau_X(x) = \bigcap \{m \mid m\colon M \hookrightarrow X \text{ satisfies } x \in Fm[FM]\}. \tag{4.3}$$


Recall that a functor preserves intersections if it preserves (wide) pullbacks 
of families of monomorphisms. 

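Gumm [13, Thm. 7.3] observed that for a set functor preserving intersections,
the maps $\tau_X\colon FX \to \mathcal{P}X$ in (4.3) form a “subnatural” transformation from $F$
to the power-set functor $\mathcal{P}$. Subnaturality means that (although these maps do
not form a natural transformation in general) for every monomorphism $i\colon X \to Y$
we have a commutative square:

$$\begin{array}{ccc}
FX & \xrightarrow{\ \tau_X\ } & \mathcal{P}X \\
{\scriptstyle Fi}\big\downarrow & & \big\downarrow{\scriptstyle \mathcal{P}i} \\
FY & \xrightarrow{\ \tau_Y\ } & \mathcal{P}Y
\end{array} \tag{4.4}$$
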
Remark 4.7. As shown in [13, Thm. 7.4] and [23, Prop. 7.5], a set functor $F$
preserves intersections iff the squares in (4.4) above are pullbacks. Moreover,
loc. cit. and [13, Thm. 8.1] prove that $\tau\colon F \to \mathcal{P}$ is a natural transformation,
provided $F$ preserves inverse images and intersections.


Definition 4.8. Let $F$ be a set functor. For every coalgebra $\alpha\colon A \to FA$ its
canonical graph is the following coalgebra for $\mathcal{P}$: $A \xrightarrow{\ \alpha\ } FA \xrightarrow{\ \tau_A\ } \mathcal{P}A$.


Thanks to the subnaturality of $\tau$ one obtains the following results.


Proposition 4.9. For every set functor $F$ preserving intersections, the next
time operator of a coalgebra $(A, \alpha)$ coincides with that of its canonical graph.


Corollary 4.10 [24, Rem. 6.3.4]. A coalgebra for a set functor preserving 
intersections is well-founded iff its canonical graph is well-founded. 


Example 4.11. (1) For a (deterministic or non-deterministic) automaton, the 
canonical graph has an edge from s to t iff there is a transition from s to t for 
some input letter. Thus, we obtain the characterization of well-foundedness as 
stated in Example 4.5(3) and (4). 

(2) Every polynomial functor $H_\Sigma\colon \mathsf{Set} \to \mathsf{Set}$ preserves intersections. Thus, a
coalgebra $(A, \alpha)$ is well-founded if there are no infinite paths in its canonical
graph. The canonical graph of $A$ has an edge from $a$ to $b$ if $\alpha(a)$ is of the form
$\sigma(c_1, \ldots, c_n)$ for some $\sigma \in \Sigma_n$ and if $b$ is one of the $c_i$’s.

(3) Thus, for the functor $FX = 1 + A \times X \times X$, the coalgebra $(A^*, s)$ of
Example 3.3(8) is easily seen to be well-founded via its canonical graph. Indeed,
for every $a \in A$ and every list $w \in A^*$, this graph has one edge from the list $aw$
to the list $w_{\le a}$ and one to $w_{>a}$. Since both targets are shorter than $aw$, there
are no infinite paths. Hence, this is a well-founded graph.


Lemma 4.12. The next time operator is monotone: if $m \le n$, then $\bigcirc m \le \bigcirc n$.


Lemma 4.13. Let $\alpha\colon A \to FA$ be a coalgebra and $m\colon B \rightarrowtail A$ a subobject.

(1) There is a coalgebra structure $\beta\colon B \to FB$ for which $m$ gives a subcoalgebra
of $(A, \alpha)$ if $m \le \bigcirc m$.

(2) There is a coalgebra structure $\beta\colon B \to FB$ for which $m$ gives a cartesian
subcoalgebra of $(A, \alpha)$ iff $m = \bigcirc m$.


Lemma 4.14. For every coalgebra homomorphism $f\colon (B, \beta) \to (A, \alpha)$ we have

$$\bigcirc_\beta \cdot \overleftarrow{f} \;\le\; \overleftarrow{f} \cdot \bigcirc_\alpha,$$

where $\bigcirc_\alpha$ and $\bigcirc_\beta$ denote the next time operators of the coalgebras $(A, \alpha)$ and
$(B, \beta)$, respectively, and $\le$ is the pointwise order.


Corollary 4.15. For every coalgebra homomorphism $f\colon (B, \beta) \to (A, \alpha)$ we
have $\bigcirc_\beta \cdot \overleftarrow{f} = \overleftarrow{f} \cdot \bigcirc_\alpha$, provided that either

(1) $f$ is a monomorphism in $\mathscr{A}$ and $F$ preserves finite intersections, or

(2) $F$ preserves inverse images.


Definition 4.16 [4]. The well-founded part of a coalgebra is its largest well- 
founded subcoalgebra. 


The well-founded part of a coalgebra always exists and is the coreflection 
in the category of well-founded coalgebras [6, Prop. 2.27]. We provide a new, 
shorter proof of this fact. The well-founded part is obtained by the following: 


Construction 4.17 [6, Not. 2.22]. Let $\alpha\colon A \to FA$ be a coalgebra. We know
that $\mathrm{Sub}(A)$ is a complete lattice and that the next time operator $\bigcirc$ is monotone
(see Lemma 4.12). Hence, by the Knaster-Tarski fixed point theorem, $\bigcirc$ has a
least fixed point, which we denote by $a^*\colon A^* \rightarrowtail A$.

By Lemma 4.13(2), we know that there is a coalgebra structure $\alpha^*\colon A^* \to FA^*$
so that $a^*\colon (A^*, \alpha^*) \rightarrowtail (A, \alpha)$ is the smallest cartesian subcoalgebra of $(A, \alpha)$.


Proposition 4.18. For every coalgebra $(A, \alpha)$, the coalgebra $(A^*, \alpha^*)$ is
well-founded.


Proof. Let $m\colon (B, \beta) \rightarrowtail (A^*, \alpha^*)$ be a cartesian subcoalgebra. By Lemma 4.13,
$a^* \cdot m\colon B \rightarrowtail A$ is a fixed point of $\bigcirc$. Since $a^*$ is the least fixed point, we have
$a^* \le a^* \cdot m$, i.e. $a^* = a^* \cdot m \cdot x$ for some $x\colon A^* \rightarrowtail B$. Since $a^*$ is monic, we thus
have $m \cdot x = \mathrm{id}_{A^*}$. So $m$ is a monomorphism and a split epimorphism, whence
an isomorphism.


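Proposition 4.19. The full subcategory of Coalg F given by well-founded
coalgebras is coreflective. In fact, the well-founded coreflection of a coalgebra $(A, \alpha)$
is its well-founded part $a^*\colon (A^*, \alpha^*) \rightarrowtail (A, \alpha)$.


Proof. We are to prove that for every coalgebra homomorphism $f\colon (B, \beta) \to (A, \alpha)$,
where $(B, \beta)$ is well-founded, there exists a coalgebra homomorphism
$f^\sharp\colon (B, \beta) \to (A^*, \alpha^*)$ such that $a^* \cdot f^\sharp = f$. The uniqueness is easy.

For the existence of $f^\sharp$, we first observe that $\overleftarrow{f}(a^*)$ is a pre-fixed point of
$\bigcirc_\beta$: indeed, using Lemma 4.14 we have
$\bigcirc_\beta(\overleftarrow{f}(a^*)) \le \overleftarrow{f}(\bigcirc_\alpha(a^*)) = \overleftarrow{f}(a^*)$.
By Remark 4.4(2), we therefore have $\mathrm{id}_B = b^* \le \overleftarrow{f}(a^*)$ in $\mathrm{Sub}(B)$. Using the
adjunction of Lemma 2.11, we have $\overrightarrow{f}(\mathrm{id}_B) \le a^*$ in $\mathrm{Sub}(A)$. Now factorize $f$ as
$B \overset{e}{\twoheadrightarrow} C \overset{m}{\rightarrowtail} A$. We have $\overrightarrow{f}(\mathrm{id}_B) = m$, and we then obtain
$m = \overrightarrow{f}(\mathrm{id}_B) \le a^*$, i.e. there exists a morphism $h\colon C \to A^*$ such that
$a^* \cdot h = m$. Thus, $f^\sharp = h \cdot e\colon B \to A^*$ is a morphism satisfying
$a^* \cdot f^\sharp = a^* \cdot h \cdot e = m \cdot e = f$. It follows that $f^\sharp$ is a coalgebra homomorphism
from $(B, \beta)$ to $(A^*, \alpha^*)$ since $f$ and $a^*$ are and $F$ preserves monomorphisms.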


Construction 4.20 [6, Not. 2.22]. Let $(A, \alpha)$ be a coalgebra. We obtain
$a^*$, the least fixed point of $\bigcirc$, as the join of the following transfinite chain of
subobjects $a_i\colon A_i \rightarrowtail A$, $i \in \mathrm{Ord}$. First, put $a_0 = \bot_A$, the least subobject of $A$.
Given $a_i\colon A_i \rightarrowtail A$, put $a_{i+1} = \bigcirc a_i\colon A_{i+1} = \bigcirc A_i \rightarrowtail A$. For every limit ordinal
$j$, put $a_j = \bigvee_{i<j} a_i$. Since $\mathrm{Sub}(A)$ is a set, there exists an ordinal $i$ such that
$a_i = a^*\colon A^* \rightarrowtail A$.

Remark 4.21. Note that, whenever monomorphisms are smooth, we have $A_0 = 0$
and the above join $a_j$ is obtained as the colimit of the chain of the subobjects
$a_i\colon A_i \rightarrowtail A$, $i < j$ (see Remark 2.12).


If $F$ is a finitary functor on a locally finitely presentable category, then the
least ordinal $i$ with $a^* = a_i$ is at most $\omega$, but in general one needs transfinite
iteration to reach a fixed point.


Example 4.22. Let $(A, \alpha)$ be a graph regarded as a coalgebra for $\mathcal{P}$ (see
Example 2.1). Then $A_0 = \emptyset$, $A_1$ is formed by all leaves, i.e. those nodes with no
neighbors, $A_2$ by all leaves and all nodes such that every neighbor is a leaf, etc.
We see that a node $x$ lies in $A_{i+1}$ iff every path starting in $x$ has length at most
$i$. Hence $A^* = A_\omega$ is the set of all nodes from which no infinite paths start.
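
For a finite graph the chain $A_0 \subseteq A_1 \subseteq \cdots$ stabilizes after finitely many steps, so the well-founded part can be computed by plain iteration. The following sketch assumes, as a hypothetical encoding, that a graph is a dictionary mapping each vertex to its set of successors; the function names are illustrative only.

```python
def next_time(graph, subset):
    """Vertices all of whose successors lie in `subset`."""
    return {x for x, successors in graph.items() if successors <= subset}


def well_founded_part(graph):
    """Iterate A_0 = {} and A_{i+1} = next_time(A_i) until the chain stabilizes.
    For a finite graph this reaches the least fixed point of the operator."""
    current = set()
    while True:
        following = next_time(graph, current)
        if following == current:
            return current
        current = following


def is_well_founded(graph):
    """A finite graph is well-founded iff its well-founded part is everything."""
    return well_founded_part(graph) == set(graph)


# Vertex 3 has a loop and vertex 4 can reach it, so both admit infinite paths.
graph = {0: {1, 2}, 1: {2}, 2: set(), 3: {3}, 4: {3, 2}}
print(well_founded_part(graph))
print(is_well_founded(graph))
```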


We close with a general fact on well-founded parts of fixed points (i.e. (co)alge- 
bras whose structure is invertible). The following result generalizes [15, Cor. 3.4], 
and it also appeared before for functors preserving finite intersections [4, The- 
orem 8.16 and Remark 8.18]. Here we lift the latter assumption (see [5, The- 
orem 7.6] for the new proof): 


Theorem 4.23. Let $\mathscr{A}$ be a complete and well-powered category with smooth
monomorphisms. For $F$ preserving monomorphisms, the well-founded part of
every fixed point is an initial algebra. In particular, the only well-founded fixed
point is the initial algebra.


Example 4.24. We illustrate that for a set functor $F$ preserving monomorphisms,
the well-founded part of the terminal coalgebra is the initial algebra.
Consider $FX = A \times X + 1$. The terminal coalgebra is the set $A^\omega \cup A^*$ of finite
and infinite sequences from the set $A$. The initial algebra is $A^*$. It is easy to
check that $A^*$ is the well-founded part of $A^\omega \cup A^*$.


5 The General Recursion Theorem and its Converse 


The main consequence of well-foundedness is parametric recursivity. This is 
Taylor’s General Recursion Theorem [24, Theorem 6.3.13]. Taylor assumed that 
F preserves inverse images. We present a new proof for which it is sufficient that 
F preserves monomorphisms, assuming those are smooth. 


Theorem 5.1 (General Recursion Theorem). Let $\mathscr{A}$ be a complete and
well-powered category with smooth monomorphisms. For $F\colon \mathscr{A} \to \mathscr{A}$ preserving
monomorphisms, every well-founded coalgebra is parametrically recursive.


Proof sketch. (1) Let $(A, \alpha)$ be well-founded. We first prove that it is recursive.
We use the subobjects $a_i\colon A_i \rightarrowtail A$ of Construction 4.20$^4$, the corresponding
morphisms $\alpha(a_i)\colon A_{i+1} = \bigcirc A_i \to FA_i$ (cf. Definition 4.3), and the recursive
coalgebras $(F^i 0, w_{i,i+1})$ of Example 3.3(6). We obtain a natural transformation
$h$ from the chain $(A_i)$ in Construction 4.20 to the initial-algebra chain $(F^i 0)$ (see
Remark 2.13) by transfinite recursion.

Now for every algebra $e\colon FX \to X$, we obtain a unique coalgebra-to-algebra
morphism $f_i\colon F^i 0 \to X$, i.e. we have that $f_i = e \cdot Ff_i \cdot w_{i,i+1}$. Since $(A, \alpha)$ is
well-founded, we know that $\alpha = \alpha^* = \alpha(a_i)$ for some $i$. From this it is not difficult
to prove that $f_i \cdot h_i$ is a coalgebra-to-algebra morphism from $(A, \alpha)$ to $(X, e)$.

In order to prove uniqueness, we prove by transfinite induction that for any
given coalgebra-to-algebra homomorphism $e^\dagger$, one has $e^\dagger \cdot a_j = f_j \cdot h_j$ for
every ordinal number $j$. Then for the above ordinal number $i$ with $a_i = \mathrm{id}_A$, we
have $e^\dagger = f_i \cdot h_i$, as desired. This shows that $(A, \alpha)$ is recursive.

$^4$ One might object to this use of transfinite recursion, since Theorem 5.1 itself could
be used as a justification for transfinite recursion. Let us emphasize that we are
not presenting Theorem 5.1 as a foundational contribution. We are building on the
classical theory of transfinite recursion.

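(2) We prove that $(A, \alpha)$ is parametrically recursive. Consider the coalgebra
$\langle \alpha, \mathrm{id}_A \rangle\colon A \to FA \times A$ for $F(-) \times A$. This functor preserves monomorphisms
since $F$ does and monomorphisms are closed under products. The next time
operator $\bigcirc$ on $\mathrm{Sub}(A)$ is the same for both coalgebras since, for a subobject
$m\colon S \rightarrowtail A$, the square (4.1) is a pullback if and only if the following square is one:

$$\begin{array}{ccc}
\bigcirc S & \xrightarrow{\ \langle \alpha(m),\, \bigcirc m \rangle\ } & FS \times A \\
{\scriptstyle \bigcirc m}\big\downarrow & & \big\downarrow{\scriptstyle Fm \times A} \\
A & \xrightarrow{\quad \langle \alpha,\, \mathrm{id}_A \rangle \quad} & FA \times A
\end{array}$$

Since $\mathrm{id}_A$ is the unique fixed point of $\bigcirc$ w.r.t. $F$ (see Definition 4.3), it is also the
unique fixed point of $\bigcirc$ w.r.t. $F(-) \times A$. Thus, $(A, \langle \alpha, \mathrm{id}_A \rangle)$ is a well-founded
coalgebra for $F(-) \times A$. By the previous argument, this coalgebra is thus recursive for
$F(-) \times A$; equivalently, $(A, \alpha)$ is parametrically recursive for $F$.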


Theorem 5.2. For every endofunctor on $\mathsf{Set}$ or $\mathsf{Vec}_K$ (vector spaces and linear
maps), every well-founded coalgebra is parametrically recursive.


Proof sketch. For $\mathsf{Set}$, we apply Theorem 5.1 to the Trnková hull $\bar F$ (see
Proposition 2.3), noting that $F$ and $\bar F$ have the same (non-empty) coalgebras. Moreover,
one can show that every well-founded (or recursive) $\bar F$-coalgebra is a well-founded
(recursive, resp.) $F$-coalgebra. For $\mathsf{Vec}_K$, observe that monomorphisms split and
are therefore preserved by every endofunctor $F$.


Example 5.3. We saw in Example 4.11(3) that for $FX = 1 + A \times X \times X$
the coalgebra $(A^*, s)$ from Example 3.3(8) is well-founded, and therefore it is
(parametrically) recursive.


Example 5.4. Well-founded coalgebras need not be recursive when $F$ does
not preserve monomorphisms. We take $\mathscr{A}$ to be the category of sets with a
predicate, i.e. pairs $(X, A)$, where $A \subseteq X$. Morphisms $f\colon (X, A) \to (Y, B)$ satisfy
$f[A] \subseteq B$. Denote by $\mathbb{1}$ the terminal object $(1, 1)$. We define an endofunctor
$F$ by $F(X, \emptyset) = (X + 1, \emptyset)$, and for $A \neq \emptyset$, $F(X, A) = \mathbb{1}$. For a morphism
$f\colon (X, A) \to (Y, B)$, put $Ff = f + \mathrm{id}$ if $A = \emptyset$; if $A \neq \emptyset$, then also $B \neq \emptyset$ and
$Ff$ is $\mathrm{id}\colon \mathbb{1} \to \mathbb{1}$.

The terminal coalgebra is $\mathrm{id}\colon \mathbb{1} \to \mathbb{1}$, and it is easy to see that it is
well-founded. But it is not recursive: there are no coalgebra-to-algebra morphisms
into an algebra of the form $F(X, \emptyset) \to (X, \emptyset)$.


We next prove a converse to Theorem 5.1: “recursive $\implies$ well-founded”.
Related results appear in Taylor [23, 24], Adámek et al. [3] and Jeannin et
al. [15].

Recall universally smooth monomorphisms from Definition 2.8(2). A pre-fixed
point of $F$ is a monic algebra $\alpha\colon FA \rightarrowtail A$.


Theorem 5.5. Let $\mathscr{A}$ be a complete and well-powered category with universally
smooth monomorphisms, and suppose that $F\colon \mathscr{A} \to \mathscr{A}$ preserves inverse images
and has a pre-fixed point. Then every recursive coalgebra is well-founded.


Proof. (1) We first observe that an initial algebra exists. This follows from results
by Trnková et al. [25] as we now briefly recall. Recall the initial-algebra chain
from Remark 2.13. Let $\beta\colon FB \rightarrowtail B$ be a pre-fixed point. Then there is a unique
cocone $\beta_i\colon F^i 0 \to B$ satisfying $\beta_{i+1} = \beta \cdot F\beta_i$. Moreover, each $\beta_i$ is monomorphic.
Since $B$ has only a set of subobjects, there is some $\lambda$ such that for every $i \ge \lambda$,
all of the morphisms $\beta_i$ represent the same subobject of $B$. Consequently, $w_{\lambda,\lambda+1}$
of Remark 2.13 is an isomorphism, due to $\beta_\lambda = \beta_{\lambda+1} \cdot w_{\lambda,\lambda+1}$. Then $\mu F = F^\lambda 0$
with the structure $\iota = w_{\lambda,\lambda+1}^{-1}\colon F(\mu F) \to \mu F$ is an initial algebra.

(2) Now suppose that $(A, \alpha)$ is a recursive coalgebra. Then there exists a unique
coalgebra homomorphism $h\colon (A, \alpha) \to (\mu F, \iota^{-1})$. Let us abbreviate $w_{i,\lambda}$ by
$c_i\colon F^i 0 \rightarrowtail \mu F$, and recall the subobjects $a_i\colon A_i \rightarrowtail A$ from Construction 4.20.
We will prove by transfinite induction that $a_i$ is the inverse image of $c_i$ under $h$; in
symbols: $a_i = \overleftarrow{h}(c_i)$ for all ordinals $i$. Then it follows that $a_\lambda$ is an isomorphism,
since so is $c_\lambda$, whence $(A, \alpha)$ is well-founded.

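In the base case $i = 0$ this is clear since $A_0 = W_0 = 0$ is a strict initial object.

For the isolated step we compute the pullback of $c_{i+1}\colon W_{i+1} \to \mu F$ along $h$
using the following diagram, in which the composite of the bottom row is $h$:

$$\begin{array}{ccccccc}
A_{i+1} & \xrightarrow{\ \alpha(a_i)\ } & FA_i & \xrightarrow{\ Fh_i\ } & FW_i & & \\
{\scriptstyle a_{i+1}}\big\downarrow & & \big\downarrow{\scriptstyle Fa_i} & & \big\downarrow{\scriptstyle Fc_i} & \searrow{\scriptstyle c_{i+1}} & \\
A & \xrightarrow{\ \alpha\ } & FA & \xrightarrow{\ Fh\ } & F(\mu F) & \xrightarrow{\ \iota\ } & \mu F
\end{array}$$

By the induction hypothesis and since $F$ preserves inverse images, the middle
square above is a pullback. Since the structure map $\iota$ of the initial algebra is an
isomorphism, it follows that the middle square pasted with the right-hand triangle
is also a pullback. Finally, the left-hand square is a pullback by the definition of
$a_{i+1}$. Thus, the outside of the above diagram is a pullback, as required.

For a limit ordinal $j$, we know that $a_j = \bigvee_{i<j} a_i$ and similarly, $c_j = \bigvee_{i<j} c_i$
since $W_j = \operatorname{colim}_{i<j} W_i$ and monomorphisms are smooth (see Remark 2.12(2)).
Using Remark 2.12(3) and the induction hypothesis we thus obtain

$$\overleftarrow{h}(c_j) = \overleftarrow{h}\Big(\bigvee_{i<j} c_i\Big) = \bigvee_{i<j} \overleftarrow{h}(c_i) = \bigvee_{i<j} a_i = a_j.$$
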
Corollary 5.6. Let $\mathscr{A}$ and $F$ satisfy the assumptions of Theorem 5.5. Then the
following properties of a coalgebra are equivalent:
(1) well-foundedness,
(2) parametric recursiveness,
(3) recursiveness,
(4) existence of a homomorphism into $(\mu F, \iota^{-1})$,
(5) existence of a homomorphism into a well-founded coalgebra.


Proof sketch. We already know (1) $\Rightarrow$ (2) $\Rightarrow$ (3). Since $F$ has an initial algebra (as
proved in Theorem 5.5), the implication (3) $\Rightarrow$ (4) follows from Example 3.3(2).
In Theorem 5.5 we also proved (4) $\Rightarrow$ (1). The implication (4) $\Rightarrow$ (5) follows
from Example 4.5(2). Finally, it follows from [6, Remark 2.40] that $(\mu F, \iota^{-1})$ is
a terminal well-founded coalgebra, whence (5) $\Rightarrow$ (4).


Example 5.7. (1) The category of many-sorted sets satisfies the assumptions 
of Theorem 5.5, and polynomial endofunctors on that category preserve inverse 
images. Thus, we obtain Jeannin et al.’s result [15, Thm. 3.3] that (1)-(4) in 
Corollary 5.6 are equivalent as a special instance. 

(2) The implication (4) $\Rightarrow$ (3) in Corollary 5.6 does not hold for vector spaces.
In fact, for the identity functor on $\mathsf{Vec}_K$ we have $\mu\mathrm{Id} = (0, \mathrm{id})$. Hence, every
coalgebra has a homomorphism into $\mu\mathrm{Id}$. However, not every coalgebra is recursive,
e.g. the coalgebra $(K, \mathrm{id})$ admits many coalgebra-to-algebra morphisms to the
algebra $(K, \mathrm{id})$. Similarly, the implication (4) $\Rightarrow$ (1) does not hold.


We also wish to mention a result due to Taylor [23, Rem. 3.8]. It uses the concept
of a subobject classifier originating in [17] and prominent in topos theory. This is
an object $\Omega$ with a subobject $t\colon 1 \rightarrowtail \Omega$ such that for every subobject $b\colon B \rightarrowtail A$
there is a unique $\hat{b}\colon A \to \Omega$ such that $b$ is the inverse image of $t$ under $\hat{b}$. By
definition, every elementary topos has a subobject classifier, in particular every
category $\mathsf{Set}^{\mathscr{C}}$ with $\mathscr{C}$ small.

Our standing assumption that $\mathscr{A}$ is a complete and well-powered category is
not needed for the next result: finite limits are sufficient.


Theorem 5.8 (Taylor [23]). Let F be an endofunctor preserving inverse im- 
ages on a finitely complete category with a subobject classifier. Then every recursive 
coalgebra is well-founded. 


Corollary 5.9. For every set functor preserving inverse images, the following 
properties of a coalgebra are equivalent: 


well-foundedness $\iff$ parametric recursiveness $\iff$ recursiveness.


Example 5.10. The hypothesis in Theorems 5.5 and 5.8 that the functor
preserves inverse images cannot be lifted. In order to see this, we consider the
functor $R\colon \mathsf{Set} \to \mathsf{Set}$ of Example 2.2(4). It preserves monomorphisms but not
inverse images. The coalgebra $A = \{0,1\}$ with the structure $\alpha$ constant to $(0,1)$
is recursive: given an algebra $\beta\colon RB \to B$, the unique coalgebra-to-algebra
homomorphism $h\colon \{0,1\} \to B$ is given by $h(0) = h(1) = \beta(d)$. But $A$ is not
well-founded: $\emptyset$ is a cartesian subcoalgebra.
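
The failure of well-foundedness can be checked mechanically by listing the fixed points of the next time operator on all four subsets of $A = \{0,1\}$. The sketch below rests on an assumption about Example 2.2(4), which is not restated in this section: it takes $RX$ to consist of the pairs $(x, y)$ of distinct elements of $X$ together with one extra element $d$. The helper names are hypothetical.

```python
from itertools import combinations

D = "d"  # the extra element of RX (assumed shape of the functor R)


def R_image(subset):
    """Image of R(S) inside R(A) for a subset S of A:
    all pairs of distinct elements of S, plus the element d."""
    return {(x, y) for x in subset for y in subset if x != y} | {D}


A = {0, 1}
alpha = {0: (0, 1), 1: (0, 1)}  # the coalgebra structure, constant to (0, 1)


def next_time(subset):
    """Inverse image under alpha of the image of R(subset) in R(A)."""
    image = R_image(subset)
    return {x for x in A if alpha[x] in image}


def subsets(s):
    """All subsets of s, ordered by size."""
    items = sorted(s)
    return [set(c) for r in range(len(items) + 1) for c in combinations(items, r)]


fixed_points = [S for S in subsets(A) if next_time(S) == S]
print(fixed_points)
```

Under that assumption the empty set is a fixed point besides $A$ itself, so $\mathrm{id}_A$ is not the only fixed point.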


Recall that an initial algebra $(\mu F, \iota)$ is also considered as a coalgebra $(\mu F, \iota^{-1})$.
Taylor [23, Cor. 9.9] showed that, for functors preserving inverse images, the
terminal well-founded coalgebra is the initial algebra. Surprisingly, this result is
true for all set functors.


Theorem 5.11 [6, Thm. 2.46]. For every set functor, a terminal well-founded 
coalgebra is precisely an initial algebra. 


Theorem 5.12. For every functor on $\mathsf{Vec}_K$ preserving inverse images, the
following properties of a coalgebra are equivalent:


well-foundedness $\iff$ parametric recursiveness $\iff$ recursiveness.


6 Closure Properties of Well-founded Coalgebras 


In this section we will see that strong quotients and subcoalgebras (see Remark 2.7) 
of well-founded coalgebras are well-founded again. We mention the following 
corollary to Proposition 4.19. For endofunctors on sets preserving inverse images 
this was stated by Taylor [24, Exercise VI.16]: 


Proposition 6.1. The subcategory of Coalg F formed by all well-founded coal- 
gebras is closed under strong quotients and coproducts in Coalg F. 


This follows from a general result on coreflective subcategories [2, Thm. 16.8]: 
the category Coalg F has the factorization system of Proposition 2.6, and its 
full subcategory of well-founded coalgebras is coreflective with monomorphic 
coreflections (see Proposition 4.19). Consequently, it is closed under strong 
quotients and colimits. 

We prove next that, for an endofunctor preserving finite intersections,
well-founded coalgebras are closed under subcoalgebras provided that the complete
lattice $\mathrm{Sub}(A)$ is a frame. This means that for every subobject $m\colon B \rightarrowtail A$ and
every family $m_i$ ($i \in I$) of subobjects of $A$ we have
$m \wedge \bigvee_{i \in I} m_i = \bigvee_{i \in I} (m \wedge m_i)$.
Equivalently, $\overleftarrow{m}\colon \mathrm{Sub}(A) \to \mathrm{Sub}(B)$ (see Notation 2.10) has a right adjoint
$m_*\colon \mathrm{Sub}(B) \to \mathrm{Sub}(A)$.

This property holds for $\mathsf{Set}$ as well as for the categories of posets, graphs,
topological spaces, and presheaf categories $\mathsf{Set}^{\mathscr{C}}$, $\mathscr{C}$ small. Moreover, it holds for
every Grothendieck topos. The categories of complete partial orders and $\mathsf{Vec}_K$
do not satisfy this requirement.


Proposition 6.2. Suppose that $F$ preserves finite intersections, and let $(A, \alpha)$
be a well-founded coalgebra such that $\mathrm{Sub}(A)$ is a frame. Then every subcoalgebra
of $(A, \alpha)$ is well-founded.

Proof. Let $m\colon (B, \beta) \rightarrowtail (A, \alpha)$ be a subcoalgebra. We will show that the only
pre-fixed point of $\bigcirc_\beta$ is $\mathrm{id}_B$ (cf. Remark 4.4(2)). Suppose $s\colon S \rightarrowtail B$ fulfils
$\bigcirc_\beta(s) \le s$. Since $F$ preserves finite intersections, we have
$\overleftarrow{m} \cdot \bigcirc_\alpha = \bigcirc_\beta \cdot \overleftarrow{m}$ by Corollary 4.15(1). The counit of the above adjunction
$\overleftarrow{m} \dashv m_*$ yields $\overleftarrow{m}(m_*(s)) \le s$, so that we obtain
$\overleftarrow{m}(\bigcirc_\alpha(m_*(s))) = \bigcirc_\beta(\overleftarrow{m}(m_*(s))) \le \bigcirc_\beta(s) \le s$. Using again
the adjunction $\overleftarrow{m} \dashv m_*$, we have equivalently that $\bigcirc_\alpha(m_*(s)) \le m_*(s)$; i.e. $m_*(s)$
is a pre-fixed point of $\bigcirc_\alpha$. Since $(A, \alpha)$ is well-founded, Corollary 4.15(1) implies
that $m_*(s) = \mathrm{id}_A$. Since $\overleftarrow{m}$ is also a right adjoint and therefore preserves the top
element of $\mathrm{Sub}(B)$, we thus obtain $\mathrm{id}_B = \overleftarrow{m}(\mathrm{id}_A) = \overleftarrow{m}(m_*(s)) \le s$.


Remark 6.3. Given a set functor $F$ preserving inverse images, a much better
result was proved by Taylor [24, Corollary 6.3.6]: for every coalgebra
homomorphism $f\colon (B, \beta) \to (A, \alpha)$ with $(A, \alpha)$ well-founded so is $(B, \beta)$. In fact, our
proof above is essentially Taylor’s.


Corollary 6.4. If a set functor preserves finite intersections, then subcoalgebras 
of well-founded coalgebras are well-founded. 


Trnková [26] proved that every set functor preserves all nonempty finite 
intersections. However, this does not suffice for Corollary 6.4: 


Example 6.5. A well-founded coalgebra for a set functor can have
non-well-founded subcoalgebras. Let $F\emptyset = 1$ and $FX = 1 + 1$ for all nonempty sets $X$, and
let $Ff = \mathrm{inl}\colon 1 \to 1 + 1$ be the left-hand injection for all maps $f\colon \emptyset \to X$ with
$X$ nonempty. The coalgebra $\mathrm{inr}\colon 1 \to F1$ is not well-founded because its empty
subcoalgebra is cartesian. However, this is a subcoalgebra of $\mathrm{id}\colon 1 + 1 \to 1 + 1$
(via the embedding $\mathrm{inr}$), and the latter is well-founded.


The fact that subcoalgebras of a well-founded coalgebra are well-founded does
not necessarily need the assumption that $\mathrm{Sub}(A)$ is a frame. Instead, one may
assume that the class of monomorphisms is universally smooth:


Theorem 6.6. If $\mathscr{A}$ has universally smooth monomorphisms and $F$ preserves
finite intersections, every subcoalgebra of a well-founded coalgebra is well-founded.


7 Conclusions 


Well-founded coalgebras introduced by Taylor [24] have a compact definition based 
on an extension of Jacobs’ ‘next time’ operator. Our main contribution is a new 
proof of Taylor’s General Recursion Theorem that every well-founded coalgebra is 
recursive, generalizing this result to all endofunctors preserving monomorphisms 
on a complete and well-powered category with smooth monomorphisms. For 
functors preserving inverse images, we also have seen two variants of the converse
implication “recursive $\implies$ well-founded”, under additional hypotheses: one due
to Taylor for categories with a subobject classifier, and the second one provided
that the category has universally smooth monomorphisms and the functor has a 
pre-fixed point. Various counterexamples demonstrate that all our hypotheses 
are necessary. 




References 


10. 


11. 


12. 


13. 


14. 


15. 


16. 


17. 


18. 


19. 


20. 


Adamek, J.: Free algebras and automata realizations in the language of categories. 
Comment. Math. Univ. Carolin. 15, 589-602 (1974) 


. Adámek, J., Herrlich, H., Strecker, G.E.: Abstract and Concrete Categories: The 


Joy of Cats. Dover Publications, 3rd edn. (2009) 

Adámek, J., Lücke, D., Milius, S.: Recursive coalgebras of finitary functors. Theor. In- 
form. Appl. 41(4), 447—462 (2007) 

Adámek, J., Milius, S., Moss, L.S.: Fixed points of functors. J. Log. Algebr. Methods 
Program. 95, 41-81 (2018) 

Adámek, J., Milius, S., Moss, L.S.: On well-founded and recursive coalgebras (2019), 
full version; available online at http: //arxiv.org/abs/1910.09401 

Adámek, J., Milius, S., Moss, L.S., Sousa, L.: Well-pointed coalgebras. Log. Methods 
Comput. Sci. 9(2), 1-51 (2014) 

Adámek, J., Milius, S., Sousa, L., Wi8mann, T.: On finitary functors. Theor. Appl. 
Categ. 34, 1134-1164 (2019). available online at https://arxiv.org/abs/1902.05788 
Adamek, J., Rosicky, J.: Locally Presentable and Accessible Categories. Cambridge 
University Press (1994) 

Borceux, F.: Handbook of Categorical Algebra: Volume 1, Basic Category Theory. 
Encyclopedia of Mathematics and its Applications, Cambridge University Press 
(1994) 

Capretta, V., Uustalu, T., Vene, V.: Recursive coalgebras from comonads. In- 
form. and Comput. 204, 437-468 (2006) 

Capretta, V., Uustalu, T., Vene, V.: Corecursive algebras: A study of general 
structured corecursion. In: Oliveira, M., Woodcock, J. (eds.) Formal Methods: 
Foundations and Applications, Lecture Notes in Computer Science, vol. 5902, pp. 
84-100. Springer Berlin Heidelberg (2009) 

Eppendahl, A.: Coalgebra-to-algebra morphisms. In: Proc. Category Theory and 
Computer Science (CTCS). Electron. Notes Theor. Comput. Sci., vol. 29, pp. 42-49 
(1999) 

Gumm, H.: From T-coalgebras to filter structures and transition systems. In: 
Fiadeiro, J.L., Harman, N., Roggenbach, M., Rutten, J. (eds.) Algebra and Coalgebra 
in Computer Science, Lecture Notes in Computer Science, vol. 3629, pp. 194-212. 
Springer Berlin Heidelberg (2005) 

Jacobs, B.: The temporal logic of coalgebras via Galois algebras. Math. Structures 
Comput. Sci. 12(6), 875-903 (2002) 

Jeannin, J.B., Kozen, D., Silva, A.: Well-founded coalgebras, revisited. Math. Struc- 
tures Comput. Sci. 27, 1111-1131 (2017) 

Kurz, A.: Logics for Coalgebras and Applications to Computer Science. Ph.D. thesis, 
Ludwig-Maximilians-Universitat München (2000) 

Lawvere, W.F.: Quantifiers and sheaves. Actes Congés Intern. Math. 1, 329-334 
(1970) 

Manna, Z., Pniieli, A.: The Temporal Logic of Reactive and Concurrent Systems: 
Specification. Springer-Verlag (1992) 

Meseguer, J., Goguen, J.A.: Initiality, induction, and computability. In: Algebraic 
methods in semantics (Fontainebleau, 1982), pp. 459-541. Cambridge Univ. Press, 
Cambridge (1985) 

Milius, S.: Completely iterative algebras and completely iterative monads. 
Inform. and Comput. 196, 1-41 (2005) 






Milius, S., Pattinson, D., Wißmann, T.: A new foundation for finitary corecursion 
and iterative algebras. Inform. and Comput. 217 (2020), available online at 
https://doi.org/10.1016/j.ic.2019.104456 

Osius, G.: Categorical set theory: a characterization of the category of sets. J. Pure 
Appl. Algebra 4, 79-119 (1974) 

Taylor, P.: Towards a unified treatment of induction I: the general recursion theorem 
(1995-6), preprint, available at www.paultaylor.eu/ordinals/#towuti 

Taylor, P.: Practical Foundations of Mathematics. Cambridge University Press 
(1999) 

Trnková, V., Adámek, J., Koubek, V., Reiterman, J.: Free algebras, input processes 
and free monads. Comment. Math. Univ. Carolin. 16, 339-351 (1975) 

Trnková, V.: Some properties of set functors. Comment. Math. Univ. Carolin. 10, 
323-352 (1969) 

Trnková, V.: On a descriptive classification of set functors I. Com- 
ment. Math. Univ. Carolin. 12, 143-174 (1971) 


Open Access This chapter is licensed under the terms of the Creative Commons 
Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), 
which permits use, sharing, adaptation, distribution and reproduction in any medium or 
format, as long as you give appropriate credit to the original author(s) and the source, 
provide a link to the Creative Commons license and indicate if changes were made. 

The images or other third party material in this chapter are included in the chapter's 
Creative Commons license, unless indicated otherwise in a credit line to the material. If 
material is not included in the chapter's Creative Commons license and your intended 
use is not permitted by statutory regulation or exceeds the permitted use, you will need 
to obtain permission directly from the copyright holder. 




Timed Negotiations* 


S. Akshay¹ (✉), Blaise Genest², Loïc Hélouët³, and Sharvik Mital¹ 


1 IIT Bombay, Mumbai, India {akshayss,sharky}@cse.iitb.ac.in 
2 Univ Rennes, CNRS, IRISA, Rennes, France blaise.genest@irisa.fr 
3 Univ Rennes, Inria, Rennes, France loic.helouet@inria.fr 


Abstract. Negotiations were introduced in [6] as a model for concurrent 
systems with multiparty decisions. What is very appealing with negotia- 
tions is that it is one of the very few non-trivial concurrent models where 
several interesting problems, such as soundness, i.e. absence of deadlocks, 
can be solved in PTIME [3]. In this paper, we introduce the model of 
timed negotiations and consider the problem of computing the minimum 
and the maximum execution times of a negotiation. The latter can be 
solved using the algorithm of [10] computing costs in negotiations, but 
surprisingly minimum execution time cannot. 

This paper proposes new algorithms to compute both minimum and 
maximum execution time, which work in much more general classes of 
negotiations than [10], which only considered sound and deterministic 
negotiations. Further, we uncover the precise complexities of these questions, 
ranging from PTIME to $\Delta_2^P$-complete. In particular, we show that 
computing the minimum execution time is more complex than computing the 
maximum execution time in most classes of negotiations we consider. 


1 Introduction 


Distributed systems are notoriously difficult to analyze, mainly due to the ex- 
plosion of the number of configurations that have to be considered to answer 
even simple questions. A challenging task is then to propose models on which 
analysis can be performed with tractable complexities, preferably within poly- 
nomial time. Free choice Petri nets are a classical model of distributed systems 
that allow for efficient verification, in particular when the nets are 1-safe [4, 5]. 
Recently, [6] introduced a new model called negotiations for workflows and 
business processes. A negotiation describes how processes interact in a 
distributed system: a subset of processes in a node of the system takes a 
synchronous decision among several outcomes. The chosen outcome sends the 
participating processes to a new set of nodes. The execution of a negotiation ends when 
processes reach a final configuration. Negotiations can be deterministic (once an 
outcome is fixed, each process knows its unique successor node) or not. 
Negotiations are an interesting model since several properties can be decided 
with a reasonable complexity. The question of soundness, i.e., deadlock-freedom: 
whether from every reachable configuration one can reach a final configuration, 
is PSPACE-complete. However, for deterministic negotiations, it can be decided 
in PTIME [7]. The decision procedure uses reduction rules. Reduction techniques 
were originally proposed for Petri nets [2,8,11,16]. The main idea is to define 
transformation rules that produce a model of smaller size w.r.t. the original 
model, while preserving the property under analysis. In the context of 
negotiations, [7,3] proposed a sound and complete set of soundness-preserving reduction 
rules and algorithms to apply these rules efficiently. The question of soundness 
for deterministic negotiations was revisited in [9] and shown to be 
NLOGSPACE-complete using anti-patterns instead of reduction rules. Further, they show that 
the PTIME result holds even when relaxing determinism [9]. Negotiation games 
have also been considered to decide whether one particular process can force ter- 
mination of a negotiation. While this question is EXPTIME-complete in general, 
for sound and deterministic negotiations, it becomes PTIME [12]. 


While it is natural to consider cost or time in negotiations (e.g. think of the 
Brexit negotiation where time is of the essence, and which we model as running 
example in this paper), the original model of negotiations proposed by [6] is 
only qualitative. Recently, [10] has proposed a framework to associate costs to 
the executions of negotiations, and adapt a static analysis technique based on 
reduction rules to compute end-to-end cost functions that are not sensitive to 
scheduling of concurrent nodes. For sound and deterministic negotiations, the 
end-to-end cost can be computed in $O(n \cdot (C + n))$, where $n$ is the size of the 
negotiation and $C$ the time needed to compute the cost of an execution. 
Requiring soundness or determinism seems perfectly reasonable, but requiring both 
soundness and determinism is too restrictive: it prevents a process from waiting 
for decisions of other processes to know how to proceed. 


In this paper, we revisit time in negotiations. We attach time intervals to 
outcomes of nodes. We want to compute maximal and minimal execution times, 
for negotiations that are not necessarily sound and deterministic. Since we are 
interested in minimal and maximal execution time, cycles in negotiations can be 
either bypassed or lead to infinite maximal time. Hence, we restrict this study to 
acyclic negotiations. Notice that time can be modeled as a cost, following [10], 
and the maximal execution time of a sound and deterministic negotiation can 
be computed in PTIME using the algorithm from [10]. Surprisingly however, we 
give an example (Example 3) for which the minimal execution time cannot be 
computed in PTIME by this algorithm. 


The first contribution of the paper shows that reachability (whether at least 
one run of a negotiation terminates) is NP-complete, already for (untimed) deter- 
ministic acyclic negotiations. This implies that computing minimal or maximal 
execution time for deterministic (but unsound) acyclic negotiations cannot be 
done in PTIME (unless NP=PTIME). We characterize precisely the 
complexities of different decision variants (threshold, equality, etc.), with complexities 
ranging from (co-)NP-complete to $\Delta_2^P$. 

We thus turn to negotiations that are sound but not necessarily 
deterministic. Our second contribution is a new algorithm, not based on reduction rules, 
to compute the maximal execution time in PTIME for sound negotiations. It is 
based on computing the maximal execution time of critical paths in the nego- 
tiations. However, we show that minimal execution time cannot be computed 
in PTIME for sound negotiations (unless NP=PTIME): deciding whether the 
minimal execution time is lower than T is NP-complete, even for T given in 
unary, using a reduction from a Bin packing problem. This shows that minimal 
execution time is harder to compute than maximal execution time. 

Our third contribution consists in defining a class in which the minimal exe- 
cution time can be computed in (pseudo) PTIME. To do so, we define the class 
of k-layered negotiations, for k fixed, that is negotiations where nodes can be or- 
ganized into layers of at most k nodes at the same depth. These negotiations can 
be executed without remembering more than k nodes at a time. In this case, we 
show that computing the maximal execution time is PTIME, even if the negoti- 
ation is neither deterministic nor sound. The algorithm, not based on reduction 
rules, uses the k-layer restriction in order to navigate in the negotiation while 
considering only a polynomial number of configurations. For minimal execution 
time, we provide a pseudo PTIME algorithm, that is PTIME if constants are 
given in unary. Finally, we show that the size of constants does matter: deciding 
whether the minimal execution time of a k-layered negotiation is less than T 
is NP-complete, when T is given in binary. We show this by reducing from a 
Knapsack problem, yet again emphasizing that the minimal execution time of a 
negotiation is harder to compute than its maximal execution time. 

This paper is organized as follows. Section 2 introduces the key ingredients of 
negotiations, determinism and soundness, known results in the untimed setting, 
and provides our running example modeling the Brexit negotiation. Section 3 
introduces time in negotiations, gives a semantics to this new model, and for- 
malizes several decision problems on maximal and minimal durations of runs in 
timed negotiations. We recall the main results of the paper in Section 4. Then, 
Section 5 considers timed execution problems for deterministic negotiations, 
Section 6 for sound negotiations, and Section 7 for layered negotiations. Proof details 
for the last three sections are given in an extended version of this paper [1]. 


2 Negotiations: Definitions and Brexit example 


In this section, we recall the definition of negotiations, of some subclasses (acyclic 
and deterministic), as well as important problems (soundness and reachability). 


Definition 1 (Negotiation [6,10]). A negotiation over a finite set of 
processes $P$ is a tuple $\mathcal{N} = (N, n_0, n_f, \mathcal{X})$, where: 


- $N$ is a finite set of nodes. Each node is a pair $n = (P_n, R_n)$ where $P_n \subseteq P$ 
is a non-empty set of processes participating in node $n$, and $R_n$ is a finite 
set of outcomes of node $n$ (also called results), with $R_{n_f} = \{r_f\}$. We denote 
by $R$ the union of all outcomes of nodes in $N$. 

- $n_0$ is the first node of the negotiation and $n_f$ is the final node. Every process 
in $P$ participates in both $n_0$ and $n_f$. 

- For all $n \in N$, $\mathcal{X}_n : P_n \times R_n \to 2^N$ is a map defining the transition relation 
from node $n$, with $\mathcal{X}_n(p,r) = \emptyset$ iff $n = n_f$, $r = r_f$. We denote by 
$\mathcal{X} : N \times P \times R \to 2^N$ the partial map defined on 
$\bigcup_{n \in N} (\{n\} \times P_n \times R_n)$, with $\mathcal{X}(n,p,a) = \mathcal{X}_n(p,a)$ for all $p, a$. 


Fig. 1. A (sound but non-deterministic) negotiation modeling Brexit. 


Intuitively, at a node $n = (P_n, R_n)$ in a negotiation, all processes of $P_n$ have 
to agree on a common outcome $r$ chosen from $R_n$. Once this outcome $r$ is chosen, 
every process $p \in P_n$ is ready to move to any node prescribed by $\mathcal{X}(n,p,r)$. A 
new node $m$ can only start when all processes of $P_m$ are ready to move to $m$. 


Example 1. We illustrate negotiations by considering a simplified model of the 
Brexit negotiation, see Figure 1. There are 3 processes, $P = \{EU, PM, Pa\}$. At 
first EU decides whether to enforce a backstop in any deal (outcome backstop) 
or not (outcome no-backstop). In the meantime, PM decides to prorogue 
Pa, and Pa can choose whether or not to appeal to court (outcome court/no court). If it 
goes to court, then PM and Pa will take some time in court (c-meet, defend), 
before PM can meet EU to agree on a deal. Otherwise, Pa goes to recess, and 
PM can meet EU directly. Once EU and PM have agreed on a deal, PM tries to 
convince Pa to vote for the deal. The final outcome is whether the deal is voted, or 
whether Brexit is delayed. 


Definition 2 (Deterministic negotiations). A process $p \in P$ is deterministic 
iff, for every $n \in N$ and every outcome $r$ of $n$, $\mathcal{X}(n,p,r)$ is a singleton. A 
negotiation is deterministic iff all its processes are deterministic. It is weakly 
non-deterministic [9] (called weakly deterministic in [3]) iff, for every node $n$, one of 
the processes in $P_n$ is deterministic. Last, it is very weakly non-deterministic [9] 
(called weakly deterministic in [6]) iff, for every $n$, every $p \in P_n$ and every 
outcome $r$ of $n$, there exists a deterministic process $q$ such that $q \in P_{n'}$ for every 
$n' \in \mathcal{X}(n,p,r)$. 
In deterministic negotiations, once an outcome is chosen, each process knows 
the next node it will be involved in. In (very-)weakly non-deterministic nego- 
tiations, the next node might depend upon the outcome chosen in other nodes 
by other processes. However, once the outcomes have been chosen for all cur- 
rent nodes, there is only one next node possible for each process. Observe that 
the class of deterministic negotiations is isomorphic to the class of free choice 
workflow nets [10]. In Example 1, the Brexit negotiation is non-deterministic, 
because process PM is non-deterministic. Indeed, consider outcome c-meet: it 
allows two nodes, according to whether the backstop is enforced or not, which 
is a decision taken by process EU. 


Semantics: A configuration [3] of a negotiation is a mapping $M : P \to 2^N$. 
Intuitively, it tells for each process $p$ the set $M(p)$ of nodes $p$ is ready to engage in. 
The semantics of a negotiation is defined in terms of moves from a configuration 
to the next one. The initial and final configurations $M_0$ and $M_f$ are given by 
$M_0(p) = \{n_0\}$ and $M_f(p) = \emptyset$ respectively for every process $p \in P$. A configuration $M$ 
enables node $n$ if $n \in M(p)$ for every $p \in P_n$. When $n$ is enabled, a decision 
at node $n$ can occur, and the participants at this node choose an outcome 
$r \in R_n$. The occurrence of $(n,r)$ produces the configuration $M'$ given by 
$M'(p) = \mathcal{X}(n,p,r)$ for every $p \in P_n$ and $M'(p) = M(p)$ for the remaining processes in $P \setminus P_n$. 
Moving from $M$ to $M'$ after choosing $(n,r)$ is called a step, denoted $M \xrightarrow{n,r} M'$. A 
run of $\mathcal{N}$ is a sequence $(n_1,r_1),(n_2,r_2) \ldots (n_k,r_k)$ such that there is a sequence of 
configurations $M_0, M_1, \ldots, M_k$ and every $(n_i,r_i)$ is a step between $M_{i-1}$ and $M_i$. 
A run starting from the initial configuration and ending in the final configuration 
is called a final run. By definition, its last step is $(n_f, r_f)$. 
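
The step semantics above can be made concrete with a small sketch. The code below is only an illustration under stated assumptions: the toy negotiation (processes a and b, nodes n0, n1, nf) and all identifiers are hypothetical and are not taken from Fig. 1. Configurations are dictionaries from processes to sets of nodes, and final runs are enumerated exhaustively, which is exponential in general.

```python
from typing import Dict, FrozenSet, Iterator, List, Tuple

Config = Dict[str, FrozenSet[str]]  # process -> nodes it is ready to engage in

# Hypothetical toy negotiation with processes a and b (not the Brexit example).
PROCESSES = ("a", "b")
PARTICIPANTS = {"n0": {"a", "b"}, "n1": {"a"}, "nf": {"a", "b"}}
# X[(n, p, r)] is the set of nodes process p may move to after outcome r of n.
X: Dict[Tuple[str, str, str], FrozenSet[str]] = {
    ("n0", "a", "go"): frozenset({"n1"}),
    ("n0", "b", "go"): frozenset({"nf"}),
    ("n1", "a", "done"): frozenset({"nf"}),
    ("nf", "a", "rf"): frozenset(),
    ("nf", "b", "rf"): frozenset(),
}


def outcomes(n: str) -> List[str]:
    """Outcomes R_n of node n, read off the keys of X."""
    return sorted({r for (m, _, r) in X if m == n})


def enabled(M: Config, n: str) -> bool:
    """M enables n iff every participant of n is ready to engage in n."""
    return all(n in M[p] for p in PARTICIPANTS[n])


def step(M: Config, n: str, r: str) -> Config:
    """Configuration M' produced by the occurrence of (n, r) in M."""
    M2 = dict(M)
    for p in PARTICIPANTS[n]:
        M2[p] = X[(n, p, r)]
    return M2


def final_runs(M: Config, prefix: Tuple = ()) -> Iterator[List[Tuple[str, str]]]:
    """Enumerate all runs from M that end in the final configuration."""
    if all(not M[p] for p in PROCESSES):
        yield list(prefix)
        return
    for n in sorted(PARTICIPANTS):
        if enabled(M, n):
            for r in outcomes(n):
                yield from final_runs(step(M, n, r), prefix + ((n, r),))


M0: Config = {p: frozenset({"n0"}) for p in PROCESSES}
runs = list(final_runs(M0))
print(runs)
```

In this toy instance the only final run takes outcome go at n0, then done at n1, and ends with the step (nf, rf), as required by the definition.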


An important class of negotiations in the context of timed negotiations is 
acyclic negotiations, where infinite sequences of steps are impossible: 


Definition 3 (Acyclic negotiations). The graph of a negotiation $\mathcal{N}$ is the 
labeled graph $G_\mathcal{N} = (V,E)$ where $V = N$, and $E = \{(n,(p,r),n') \mid n' \in \mathcal{X}(n,p,r)\}$, 
with pairs of the form $(p,r)$ being the labels. A negotiation is acyclic 
iff its graph is acyclic. We denote by $Paths(G_\mathcal{N})$ the set of paths in the graph of a 
negotiation. These paths are of the form $\pi = (n_0,(p_0,r_0),n_1) \cdots (n_{k-1},(p_{k-1},r_{k-1}),n_k)$. 


The Brexit negotiation of Fig.1 is an example of acyclic negotiation. Despite 
their apparent simplicity, negotiations may express involved behaviors as shown 
with the Brexit example. Indeed two important questions in this setting are 
whether there is some way to reach a final node in the negotiation from (i) the 
initial node and (ii) any reachable node in the negotiation. 


Definition 4 (Soundness and Reachability). 


1. A negotiation is sound iff every run from the initial configuration can be 
extended to a final run. The problem of soundness is to check if a given 
negotiation is sound. 

2. The problem of reachability asks if a given negotiation has a final run. 
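
To illustrate the difference between the two problems of Definition 4, the following sketch decides both by exhaustive exploration of the configuration space. It is a brute-force illustration only, exponential in general, and is not one of the algorithms from the cited works. The toy negotiation and all identifiers are hypothetical: outcome bad at n0 sends process a to a node n1 that process b never becomes ready for, which creates a deadlock.

```python
from typing import Dict, FrozenSet, List, Tuple

Config = Dict[str, FrozenSet[str]]

# Hypothetical toy negotiation: reachable (via "ok") but not sound (via "bad").
PROCESSES = ("a", "b")
PARTICIPANTS = {"n0": {"a", "b"}, "n1": {"a", "b"}, "nf": {"a", "b"}}
X: Dict[Tuple[str, str, str], FrozenSet[str]] = {
    ("n0", "a", "ok"): frozenset({"nf"}),
    ("n0", "b", "ok"): frozenset({"nf"}),
    ("n0", "a", "bad"): frozenset({"n1"}),   # a waits for n1 ...
    ("n0", "b", "bad"): frozenset({"nf"}),   # ... but b only waits for nf
    ("n1", "a", "x"): frozenset({"nf"}),
    ("n1", "b", "x"): frozenset({"nf"}),
    ("nf", "a", "rf"): frozenset(),
    ("nf", "b", "rf"): frozenset(),
}


def freeze(M: Config) -> Tuple[FrozenSet[str], ...]:
    """Hashable form of a configuration."""
    return tuple(M[p] for p in PROCESSES)


def successors(M: Config) -> List[Config]:
    """All configurations reachable from M in one step."""
    result = []
    for n, parts in PARTICIPANTS.items():
        if all(n in M[p] for p in parts):          # n is enabled in M
            for r in {r for (m, _, r) in X if m == n}:
                M2 = dict(M)
                for p in parts:
                    M2[p] = X[(n, p, r)]
                result.append(M2)
    return result


def reachable(M0: Config) -> List[Config]:
    """All configurations reachable from M0 (including M0 itself)."""
    seen = {freeze(M0): M0}
    stack = [M0]
    while stack:
        for M2 in successors(stack.pop()):
            if freeze(M2) not in seen:
                seen[freeze(M2)] = M2
                stack.append(M2)
    return list(seen.values())


def is_final(M: Config) -> bool:
    return all(not M[p] for p in PROCESSES)


def has_final_run(M: Config) -> bool:
    """Reachability: some run from M ends in the final configuration."""
    return any(is_final(M2) for M2 in reachable(M))


def is_sound(M0: Config) -> bool:
    """Soundness: every reachable configuration can still reach the final one."""
    return all(has_final_run(M) for M in reachable(M0))


M0: Config = {p: frozenset({"n0"}) for p in PROCESSES}
print(has_final_run(M0), is_sound(M0))
```

The toy instance has a final run but is not sound, since the configuration reached after outcome bad enables no node and is not final.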
Notice that the Brexit negotiation of Fig.1 is sound (but not deterministic). 
It seems hard to preserve the important features of this negotiation while being 
both sound and deterministic. The problem of soundness has received 
considerable attention. We summarize the results about soundness in the next theorem: 


Theorem 1. Determining whether a negotiation is sound is PSPACE-complete. 
For (very-)weakly non-deterministic negotiations, it is co-NP-complete [9]. For 
acyclic negotiations, it is in DP and co-NP-hard [6]. Determining whether an 
acyclic weakly non-deterministic negotiation is sound is in PTIME [3, 9]. 
Finally, deciding soundness for deterministic negotiations is NLOGSPACE-complete [9]. 


Checking reachability is NP-complete, even for deterministic acyclic negoti- 
ations (surprisingly, we did not find this result stated before in the literature): 


Proposition 1. Reachability is NP-complete for acyclic negotiations, even if 
the negotiation is deterministic. 


Proof (sketch). One can guess a run of size $\le |\mathcal{N}|$ in polynomial time, and verify 
if it reaches $n_f$, which gives the inclusion in NP. The hardness part comes from 
a reduction from 3-CNF-SAT that can be found in the proof of Theorem 3. 


k-Layered Acyclic Negotiations 


We introduce a new class of negotiations which has good algorithmic properties, 
namely k-layered acyclic negotiations, for k fixed. Roughly speaking, nodes of a 
k-layered acyclic negotiation can be arranged in layers, and these layers contain 
at most k nodes. Before giving a formal definition, we need to define the depth 
of nodes in $\mathcal{N}$. 

First, a path in a negotiation is a sequence of nodes $n_0 \ldots n_\ell$ such that for 
all $i \in \{0, \ldots, \ell-1\}$, there exist $p_i, r_i$ with $n_{i+1} \in \mathcal{X}(n_i, p_i, r_i)$. The length of a 
path $n_0, \ldots, n_\ell$ is $\ell$. The depth $depth(n)$ of a node $n$ is the maximal length of a 
path from $n_0$ to $n$ (recall that $\mathcal{N}$ is acyclic, so this number is always finite). 


Definition 5. An acyclic negotiation is layered if for every node $n$, every path 
reaching $n$ has length $depth(n)$. An acyclic negotiation is k-layered if it is layered, 
and for all $\ell \in \mathbb{N}$, there are at most $k$ nodes at depth $\ell$. 
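
Definition 5 can be checked mechanically on the graph of a negotiation. The sketch below is a minimal illustration under stated assumptions: the successor map SUCC is a hypothetical graph with the labels $(p,r)$ dropped, the initial node is named n0, and every node is assumed to be reachable from n0. For each node it collects the lengths of all paths from n0, so a node is correctly layered exactly when this set is a singleton.

```python
from collections import Counter
from functools import lru_cache
from typing import Dict, FrozenSet, Set

# Hypothetical graph of an acyclic negotiation (labels (p, r) dropped).
SUCC: Dict[str, Set[str]] = {
    "n0": {"n1", "n2"},
    "n1": {"n3"},
    "n2": {"n3"},
    "n3": {"nf"},
    "nf": set(),
}


@lru_cache(maxsize=None)
def path_lengths(n: str) -> FrozenSet[int]:
    """Lengths of all paths from n0 to n (every node assumed reachable)."""
    if n == "n0":
        return frozenset({0})
    preds = [m for m, succ in SUCC.items() if n in succ]
    return frozenset(l + 1 for m in preds for l in path_lengths(m))


def depth(n: str) -> int:
    """Maximal length of a path from n0 to n."""
    return max(path_lengths(n))


def is_layered() -> bool:
    """Every path reaching n has length depth(n)."""
    return all(len(path_lengths(n)) == 1 for n in SUCC)


def is_k_layered(k: int) -> bool:
    """Layered, with at most k nodes at each depth."""
    per_depth = Counter(depth(n) for n in SUCC)
    return is_layered() and max(per_depth.values()) <= k


print(depth("nf"), is_layered(), is_k_layered(2), is_k_layered(1))
```

In this hypothetical graph the widest layer is the one at depth 1, which contains n1 and n2, so the graph is 2-layered but not 1-layered.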


The Brexit example of Fig. 1 is 6-layered. Notice that a layered negotiation 
is necessarily k-layered for some $k \le |N| - 2$. Note also that we can always 
transform an acyclic negotiation $\mathcal{N}$ into a layered acyclic negotiation $\mathcal{N}'$, by 
adding dummy nodes: for every node $m \in \mathcal{X}(n,p,r)$ with $depth(m) > depth(n)+1$, 
we can add several nodes $n_1, \ldots, n_\ell$ with $\ell = depth(m) - (depth(n)+1)$, and 
processes $P_{n_i} = \{p\}$. We compute a new relation $\mathcal{X}'$ such that $\mathcal{X}'(n,p,r) = \{n_1\}$, 
$\mathcal{X}'(n_\ell,p,r) = \{m\}$ and for every $i \in 1..\ell-1$, $\mathcal{X}'(n_i,p,r) = \{n_{i+1}\}$. This 
transformation is polynomial: the resulting negotiation is of size up to 
$|\mathcal{N}| \times |\mathcal{X}| \times |P|$. The proof of the following Theorem can be found in [1]. 


Theorem 2. Let k € N+. Checking reachability or soundness for a k-layered 
acyclic negotiation N can be done in PTIME. 
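
The dummy-node transformation described before Theorem 2 can be sketched as follows. This is a minimal illustration under stated assumptions, not the construction of [1]: the toy negotiation, the naming scheme for dummy nodes, and all identifiers are hypothetical, the initial node is named n0, and every node is assumed to be reachable from n0. Each edge that skips layers is replaced by a chain of single-process dummy nodes carrying the same outcome name.

```python
from typing import Dict, FrozenSet, Set, Tuple

Trans = Dict[Tuple[str, str, str], FrozenSet[str]]

# Hypothetical acyclic negotiation: after outcome "go" of n0, process a jumps
# directly to nf (depth 2) while process b goes through n1 (depth 1).
PARTICIPANTS: Dict[str, Set[str]] = {"n0": {"a", "b"}, "n1": {"b"}, "nf": {"a", "b"}}
X: Trans = {
    ("n0", "a", "go"): frozenset({"nf"}),
    ("n0", "b", "go"): frozenset({"n1"}),
    ("n1", "b", "done"): frozenset({"nf"}),
    ("nf", "a", "rf"): frozenset(),
    ("nf", "b", "rf"): frozenset(),
}


def depths(X: Trans, n0: str = "n0") -> Dict[str, int]:
    """Longest-path depth of every node reachable from n0 (X must be acyclic)."""
    succ: Dict[str, Set[str]] = {}
    for (n, _, _), targets in X.items():
        succ.setdefault(n, set()).update(targets)
    depth = {n0: 0}
    changed = True
    while changed:  # repeated relaxation; terminates because the graph is acyclic
        changed = False
        for n, targets in succ.items():
            if n in depth:
                for m in targets:
                    if depth.get(m, -1) < depth[n] + 1:
                        depth[m] = depth[n] + 1
                        changed = True
    return depth


def make_layered(participants: Dict[str, Set[str]], X: Trans):
    """Insert dummy nodes so that every edge goes from depth d to depth d + 1."""
    depth = depths(X)
    new_parts = {n: set(ps) for n, ps in participants.items()}
    new_X: Trans = {}
    for (n, p, r), targets in X.items():
        new_targets = set()
        for m in targets:
            gap = depth[m] - (depth[n] + 1)  # number of dummy nodes to insert
            chain = [f"{n}.{p}.{r}.{m}#{i}" for i in range(1, gap + 1)] + [m]
            new_targets.add(chain[0])
            for d, nxt in zip(chain, chain[1:]):
                new_parts[d] = {p}              # dummy node with the single process p
                new_X[(d, p, r)] = frozenset({nxt})
        new_X[(n, p, r)] = frozenset(new_targets)
    return new_parts, new_X


parts2, X2 = make_layered(PARTICIPANTS, X)
print(sorted(parts2))
```

In the toy instance a single dummy node is inserted between n0 and nf for process a, after which every edge connects consecutive depths.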
3 Timed Negotiations 


In many negotiations, time is an important feature to take into account. For 
instance, in the Brexit example, with an initial node starting at the beginning of 
September 2019, there are 9 weeks to pass a deal till the 31st October deadline. 
We extend negotiations by introducing timing constraints on outcomes of 
nodes, inspired by timed Petri nets [14] and by the notion of negotiations with 
costs [10]. We use time intervals to specify lower and upper bounds for the 
duration of negotiations. More precisely, we attach time intervals to pairs $(n,r)$ 
where $n$ is a node and $r$ an outcome. In the rest of the paper, we denote by 
$\mathcal{I}$ the set of intervals with endpoints that are non-negative integers or $\infty$. For 
convenience we only use closed intervals in this paper (except for $\infty$), but the 
results we show can also be extended to open intervals with some notational 
overhead. Intuitively, outcome $r$ can be taken at a node $n$ with associated time 
interval $[a,b]$ only after $a$ time units have elapsed from the time all processes 
contributing to $n$ are ready to engage in $n$, and at most $b$ time units later. 


Definition 6. A timed negotiation is a pair $(\mathcal{N}, \gamma)$ where $\mathcal{N}$ is a negotiation, 
and $\gamma : N \times R \to \mathcal{I}$ associates an interval to each pair $(n,r)$ of node and outcome 
such that $r \in R_n$. For a given node $n$ and outcome $r$, we denote by $\gamma^-(n,r)$ (resp. 
$\gamma^+(n,r)$) the lower bound (resp. the upper bound) of $\gamma(n,r)$. 


Example 2. In the Brexit example, we define the following timed constraints $\gamma$. 
We only specify the outcome names, as the timing only depends upon them. 
Backstop and no-backstop both take between 1 and 2 weeks: $\gamma$(backstop) = 
$\gamma$(no-backstop) = [1,2]. In case of no-court, recess takes 5 weeks, $\gamma$(recess) = 
[5,5], and PM can meet EU immediately, $\gamma$(meet) = [0,0]. In case of court 
action, PM needs to spend 2 weeks in court, $\gamma$(c-meet) = [2,2], and depending on 
the court delay and decision, Pa needs between 3 (court overrules recess) and 5 
(court confirms recess) weeks, $\gamma$(defend) = [3,5]. Agreeing on a deal can take 
anywhere from 2 weeks to 2 years (104 weeks): $\gamma$(deal agreed) = [2,104] (some 
would say infinite time is even possible!). It needs more time with the backstop, 
$\gamma$(deal w/backstop) = [5,104]. All other outcomes are assumed to be immediate, 
i.e., associated with [0,0]. 


Semantics: A timed valuation is a map $\mu : P \to \mathbb{R}^{\ge 0}$ that associates a 
non-negative real value to every process. A timed configuration is a pair $(M, \mu)$ where 
$M$ is a configuration and $\mu$ a timed valuation. There is a timed step from $(M, \mu)$ 
to $(M', \mu')$, denoted $(M, \mu) \xrightarrow{n,r} (M', \mu')$, if (i) $M \xrightarrow{n,r} M'$, (ii) $p \notin P_n$ 
implies $\mu'(p) = \mu(p)$, and (iii) $\exists d \in \gamma(n,r)$ such that $\forall p \in P_n$, we have 
$\mu'(p) = \max_{p' \in P_n} \mu(p') + d$ ($d$ is the duration of node $n$). 


Intuitively a timed step $(M, \mu) \xrightarrow{n,r} (M', \mu')$ depicts a decision taken at 
node $n$, and how long each process of $P_n$ waited in that node before taking 
decision $(n,r)$. The last process engaged in $n$ must wait for a duration contained 
in $\gamma(n,r)$. However, other processes may spend a time greater than $\gamma^+(n,r)$. 
A timed run is a sequence of steps $\rho = (M_0, \mu_0) \xrightarrow{n_1,r_1} (M_1, \mu_1) \ldots (M_k, \mu_k)$ 
where $M_0$ is the initial configuration, $\mu_0(p) = 0$ for every $p \in P$, and each 
$(M_i, \mu_i) \xrightarrow{n_{i+1},r_{i+1}} (M_{i+1}, \mu_{i+1})$ is a timed step. It is final if $M_k = M_f$. Its execution 
time $\delta(\rho)$ is defined as $\delta(\rho) = \max_{p \in P} \mu_k(p)$. 
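
The update rule of a timed step and the execution time of a run can be sketched as follows. This is an illustration only: the four steps, their intervals and their durations are hypothetical values chosen for the example, and the untimed side of the step (condition (i)) is left out. Each step is given by the participants of its node, the interval $\gamma(n,r)$ and the chosen duration $d$.

```python
from typing import Dict, Set, Tuple

Valuation = Dict[str, float]  # process -> time already elapsed for that process


def timed_step(mu: Valuation, participants: Set[str],
               interval: Tuple[float, float], d: float) -> Valuation:
    """Apply conditions (ii) and (iii): participants move to max(mu) + d."""
    lo, hi = interval
    if not lo <= d <= hi:
        raise ValueError(f"duration {d} is outside the interval [{lo}, {hi}]")
    start = max(mu[p] for p in participants)  # when the last participant arrives
    mu2 = dict(mu)                            # non-participants keep their value
    for p in participants:
        mu2[p] = start + d
    return mu2


def execution_time(mu: Valuation) -> float:
    """delta(rho): maximum over all processes of the last valuation."""
    return max(mu.values())


# Hypothetical timed run over processes a and b: (participants, interval, duration).
steps = [
    ({"a", "b"}, (1, 2), 1),   # initial node
    ({"a"}, (3, 5), 3),        # a alone
    ({"b"}, (0, 2), 2),        # b alone; its time (3) is below a's time (4)
    ({"a", "b"}, (0, 0), 0),   # final node
]
mu: Valuation = {"a": 0, "b": 0}
for parts, interval, d in steps:
    mu = timed_step(mu, parts, interval, d)

print(mu, execution_time(mu))
```

The third step ends at time 3 although it is listed after a step ending at time 4: time stamps grow along each process, not necessarily along the chosen linearization.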

Notice that we only attached timing to processes, not to individual steps. 
With our definition of runs, timing on steps may not be monotone (i.e., 
non-decreasing) along the run, while timing on processes is. Viewed through the lens of 
concurrent systems, the timing is monotone on the partial orders of the system 
rather than the linearization. It is not hard to restrict paths, if necessary, to have 
a monotone timing on steps as well. In this paper, we are only interested in 
execution time, which does not depend on the linearization considered. 


Given a timed negotiation $\mathcal{N}$, we can now define the minimum and maximum 
execution time, which correspond to optimistic or pessimistic views: 


Definition 7. Let $\mathcal{N}$ be a timed negotiation. Its minimum execution time, 
denoted $mintime(\mathcal{N})$, is the minimal $\delta(\rho)$ over all final timed runs $\rho$ of $\mathcal{N}$. We 
define the maximal execution time $maxtime(\mathcal{N})$ of $\mathcal{N}$ similarly. 


Given $T \in \mathbb{N}$, the main problems we consider in this paper are the following: 


- The mintime problem, i.e., do we have $mintime(\mathcal{N}) \le T$? 
In other words, does there exist a final timed run $\rho$ with $\delta(\rho) \le T$? 

- The maxtime problem, i.e., do we have $maxtime(\mathcal{N}) \le T$? 
In other words, does $\delta(\rho) \le T$ hold for every final timed run $\rho$? 


These questions have a practical interest: in the Brexit example, the question 
"is there a way to have a vote on a deal within 9 weeks?" is indeed a minimum 
execution time problem. We also address the equality variant of these decision 
problems, i.e., $mintime(\mathcal{N}) = T$: is there a final run of $\mathcal{N}$ that terminates 
in exactly $T$ time units and no other final run takes less than $T$ time units? 
Similarly for $maxtime(\mathcal{N}) = T$. 
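
For small acyclic instances, both quantities can be obtained by brute force. The sketch below is an illustration under stated assumptions, not one of the algorithms of this paper: it enumerates all final runs (exponentially many in general) of a hypothetical toy timed negotiation and evaluates each run with all durations at their lower bounds for mintime and at their upper bounds for maxtime. This suffices because the execution time of a fixed run is built from max and +, and is therefore monotone in the chosen durations.

```python
from typing import Dict, FrozenSet, Iterator, List, Tuple

Config = Dict[str, FrozenSet[str]]
Run = List[Tuple[str, str]]

# Hypothetical toy timed negotiation with processes a and b.
PROCESSES = ("a", "b")
PARTICIPANTS = {"n0": {"a", "b"}, "n1": {"a"}, "n2": {"a"}, "nf": {"a", "b"}}
X: Dict[Tuple[str, str, str], FrozenSet[str]] = {
    ("n0", "a", "fast"): frozenset({"n1"}),
    ("n0", "b", "fast"): frozenset({"nf"}),
    ("n0", "a", "slow"): frozenset({"n2"}),
    ("n0", "b", "slow"): frozenset({"nf"}),
    ("n1", "a", "x"): frozenset({"nf"}),
    ("n2", "a", "y"): frozenset({"nf"}),
    ("nf", "a", "rf"): frozenset(),
    ("nf", "b", "rf"): frozenset(),
}
# gamma(n, r) as (lower bound, upper bound); use float("inf") for unbounded.
GAMMA = {
    ("n0", "fast"): (1, 2), ("n0", "slow"): (0, 0),
    ("n1", "x"): (0, 1), ("n2", "y"): (5, 5), ("nf", "rf"): (0, 0),
}


def final_runs(M: Config, prefix: Tuple = ()) -> Iterator[Run]:
    """Enumerate all runs from M that end in the final configuration."""
    if all(not M[p] for p in PROCESSES):
        yield list(prefix)
        return
    for n in sorted(PARTICIPANTS):
        if all(n in M[p] for p in PARTICIPANTS[n]):      # n is enabled
            for r in sorted({r for (m, _, r) in X if m == n}):
                M2 = dict(M)
                for p in PARTICIPANTS[n]:
                    M2[p] = X[(n, p, r)]
                yield from final_runs(M2, prefix + ((n, r),))


def duration(run: Run, bound: int) -> float:
    """Execution time of run with every duration at its lower (0) or upper (1) bound."""
    mu = {p: 0 for p in PROCESSES}
    for n, r in run:
        t = max(mu[p] for p in PARTICIPANTS[n]) + GAMMA[(n, r)][bound]
        for p in PARTICIPANTS[n]:
            mu[p] = t
    return max(mu.values())


M0: Config = {p: frozenset({"n0"}) for p in PROCESSES}
runs = list(final_runs(M0))
mintime = min(duration(run, 0) for run in runs)
maxtime = max(duration(run, 1) for run in runs)
print(mintime, maxtime)
```

With these values, the threshold questions become simple comparisons, for instance `mintime <= T` for the mintime problem.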


Example 3. We use Fig. 1 to show that it is not easy to compute the minimal 
execution time, and in particular one cannot use the algorithm from [10] to 
compute it. Consider the node $n$ with $P_n = \{PM, Pa\}$ and $R_n = \{court, no\_court\}$. 
If the outcome is court, then PM needs 2 weeks before (s)he can talk to EU 
and Pa needs at least 3 weeks before it can debate. However, if the outcome is 
no_court, then PM need not wait before (s)he can talk to EU, but Pa wastes 
5 weeks in recess. This means that one needs to remember different alternatives 
which could be faster in the end, depending on the future. On the other hand, 
the algorithm from [10] attaches one minimal time to process Pa, and one 
minimal time to process PM. No matter the choices (0 or 2 for PM and 3 or 5 
for Pa), there will be futures in which the chosen number will over- or 
under-approximate the real minimal execution time (this choice is not explicit in [10])†. 


† The authors of [10] acknowledged the issue with their algorithm for mintime. 
For maximum execution time, it is not an issue to attach to each node a unique 
maximal execution time. The reason for the asymmetry between minimal and 
maximal execution times of a negotiation is that the execution time of a path 
is $\max_{p \in P} \mu_k(p)$, for $\mu_k$ the last timed valuation, which breaks the symmetry 
between min and max. 
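
The issue raised in Example 3 can be replayed numerically. The sketch below is a deliberately simplified abstraction and does not reproduce the structure of Fig. 1: it keeps only the pair of times at which PM and Pa become free after each outcome (the values 2/3 and 0/5 from Example 3) and models the future by a single hypothetical parameter, the extra time `pm_extra` that PM still needs before PM and Pa synchronize.

```python
from typing import Dict, Tuple

# (time at which PM is free, time at which Pa is free), in weeks, per outcome.
ALTERNATIVES: Dict[str, Tuple[int, int]] = {"court": (2, 3), "no_court": (0, 5)}


def total(pm_ready: int, pa_ready: int, pm_extra: int) -> int:
    """Hypothetical future: PM works pm_extra more weeks, then meets Pa."""
    return max(pm_ready + pm_extra, pa_ready)


def best(pm_extra: int) -> str:
    """Outcome giving the smallest total time for this particular future."""
    return min(ALTERNATIVES, key=lambda o: total(*ALTERNATIVES[o], pm_extra))


for extra in (2, 5):
    print(extra, best(extra), {o: total(*v, extra) for o, v in ALTERNATIVES.items()})

# A single summary pair cannot replace the two alternatives:
under = total(0, 3, 2)   # component-wise minima (0 for PM, 3 for Pa)
over = total(2, 5, 2)    # component-wise maxima (2 for PM, 5 for Pa)
print(under, over)
```

Neither alternative dominates the other: with a short future court is faster, with a long one no_court is faster, and collapsing both into one pair of numbers yields a value that is too small or too large for the short future. For the maximum, taking the larger value per process is safe, since the final time is itself a maximum.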


4 High level view of the main results 


In this section, we give a high-level description of our main results. Formal 
statements can be found in the sections where they are proved. We gather in 
Fig. 2 the precise complexities for the minimal and the maximal execution time 
problems for 3 classes of negotiations that we describe in the following. Since we 
are interested in minimum and maximum execution time, cycles in negotiations 
can be either bypassed or lead to infinite maximal time. Hence, while we define 
timed negotiations in general, we always restrict to acyclic negotiations (such as 
Brexit) while stating and proving results. 

In [10], a PTIME algorithm is given to compute different costs for negoti- 
ations that are both sound and deterministic. One limitation of this result is 
that it cannot compute the minimum execution time, as explained in Example 
3. A second limitation is that the class of sound and deterministic negotiations 
is quite restrictive: it cannot model situations where the next node a process 
participates in depends on the outcome from another process, as in the Brexit 
example. We thus consider classes where one of these restrictions is dropped. 

We first consider (Section 5) negotiations that are deterministic, but with- 
out the soundness restriction. We show that for this class, no timed problem 
we consider can be solved in PTIME (unless NP=PTIME). Further, we show 
that the equality problems ($maxtime(\mathcal{N}) = T$ and $mintime(\mathcal{N}) = T$) are complete for the 
complexity class DP, i.e., at the second level of the Boolean Hierarchy [15]. 

We then consider (Section 6) the class of negotiations that are sound, but not 
necessarily deterministic. We show that maximum execution time can be solved 
in PTIME, and propose a new algorithm. However, the minimum execution time 
cannot be computed in PTIME (unless NP=PTIME). Again for the mintime 
equality problem we have a matching DP-completeness result. 


           | Deterministic           | Sound                  | k-layered 
Max $\le$ T | co-NP-complete (Thm. 3) | PTIME (Prop. 3)        | PTIME (Thm. 6) 
Max = T    | DP-complete (Prop. 2)   | PTIME (Prop. 3)        | PTIME (Thm. 6) 
Min $\le$ T | NP-complete (Thm. 3)    | NP-complete* (Thm. 5)  | pseudo-PTIME (Thm. 8), NP-complete** (Thm. 7) 
Min = T    | DP-complete (Prop. 2)   | DP-complete* (Prop. 4) | pseudo-PTIME (Thm. 8) 


Fig. 2. Results for acyclic timed negotiations. DP refers to the complexity class 
Difference Polynomial time [15], the second level of the Boolean Hierarchy. 


* hardness holds even for very weakly non-deterministic negotiations, and T in unary. 
** hardness holds even for sound and very weakly non-deterministic negotiations. 
Finally, in order to obtain a polytime algorithm to compute the minimum 
execution time, we consider the class of k-layered negotiations (see Section 7): 
Given $k \in \mathbb{N}$, we can show that $maxtime(\mathcal{N})$ can be computed in PTIME for 
k-layered negotiations. We also show that while the $mintime(\mathcal{N}) \le T$ problem 
is weakly NP-complete for k-layered negotiations, we can compute $mintime(\mathcal{N})$ 
in pseudo-PTIME, i.e. in PTIME if constants are given in unary. 


5 Deterministic Negotiations 


We start by considering the class of deterministic acyclic negotiations. We show 
that neither the maximal nor the minimal execution time can be computed in PTIME 
(unless NP=PTIME), as the threshold problems are (co-)NP-complete. 


Theorem 3. The $mintime(\mathcal{N}) \le T$ decision problem is NP-complete, and the 
$maxtime(\mathcal{N}) \le T$ decision problem is co-NP-complete for acyclic deterministic 
timed negotiations. 


Proof. For $mintime(\mathcal{N}) \le T$, containment in NP is easy: we just need to guess a 
run $\rho$ (of polynomial size as $\mathcal{N}$ is acyclic), consider the associated timed run $\rho^-$ 
where all decisions are taken at their earliest possible dates, and check whether 
$\delta(\rho^-) \le T$, which can be done in time $O(|\mathcal{N}| + \log T)$. 

For the hardness, we give the proof in two steps. First, we start with a proof 
of Proposition 1 that the reachability problem is NP-hard using a reduction from 
3-CNF-SAT, i.e., given a formula $\phi$, we build a deterministic negotiation $\mathcal{N}_\phi$ s.t. $\phi$ is 
satisfiable iff $\mathcal{N}_\phi$ has a final run. In a second step, we introduce timings on this 
negotiation and show that $mintime(\mathcal{N}_\phi) \le T$ iff $\phi$ is satisfiable. 

Step 1: Reducing 3-CNF-SAT to Reachability problem. 

Given a Boolean formula $\phi$ with variables $v_i$, $1 \le i \le n$ and clauses $c_j$, $1 \le j \le m$, 
for each variable $v_i$ we define the sets of clauses $S_{i,t} = \{c_j \mid v_i$ is present in $c_j\}$ 
and $S_{i,f} = \{c_j \mid \neg v_i$ is present in $c_j\}$. Clauses in $S_{i,t}$ and $S_{i,f}$ are naturally 
ordered: $c_i < c_j$ iff $i < j$. We denote these elements $S_{i,t}(1) < S_{i,t}(2) < \ldots$. 
Similarly for set $S_{i,f}$. 
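
As a small illustration of these sets, the sketch below computes the ordered sets of clause indices for every variable. The encoding is an assumption made for the example and is not part of the paper: a clause is a list of non-zero integers, with i standing for the literal $v_i$ and -i for $\neg v_i$, and clauses are numbered from 1.

```python
from typing import Dict, List, Tuple

Clause = List[int]  # assumed encoding: i means v_i, -i means "not v_i"


def clause_sets(clauses: List[Clause],
                n_vars: int) -> Tuple[Dict[int, List[int]], Dict[int, List[int]]]:
    """Return (S_t, S_f): for each variable i, the ordered clause indices
    containing v_i (S_t[i]) and containing the negation of v_i (S_f[i])."""
    S_t = {i: [j for j, c in enumerate(clauses, start=1) if i in c]
           for i in range(1, n_vars + 1)}
    S_f = {i: [j for j, c in enumerate(clauses, start=1) if -i in c]
           for i in range(1, n_vars + 1)}
    return S_t, S_f


# Hypothetical formula: (v1 or not v2 or v3) and (not v1 or v2 or v3)
phi = [[1, -2, 3], [-1, 2, 3]]
S_t, S_f = clause_sets(phi, n_vars=3)
print(S_t)
print(S_f)
```

Because clauses are scanned in increasing index order, each list is already sorted, so `S_t[i][j - 1]` plays the role of $S_{i,t}(j)$.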

Now, we construct a negotiation $\mathcal{N}_\phi$ (as depicted in Figure 3) with a process 
$V_i$ for each variable $v_i$ and a process $C_j$ for each clause $c_j$: 


- Initial node $n_0$ has a single outcome $r$ taking each process $C_j$ to node $Lone_{c_j}$, 
and each process $V_i$ to node $Lone_{v_i}$. 

- $Lone_{c_j}$ has three outcomes: if literal $v_i \in c_j$, then $t_i$ is an outcome, taking 
$C_j$ to $Pair_{c_j,v_i}$, and if literal $\neg v_i \in c_j$, then $f_i$ is an outcome, taking $C_j$ to 
$Pair_{c_j,\neg v_i}$. 

- The outcomes of $Lone_{v_i}$ are true and false. Outcome true brings $V_i$ to 
node $Tlone_{v_i,1}$ and outcome false brings $V_i$ to node $Flone_{v_i,1}$. 

- We have a node $Tlone_{v_i,j}$ for each $j \le |S_{i,t}|$ and $Flone_{v_i,j}$ for each $j \le |S_{i,f}|$, 
with $V_i$ as only process. Let $c_r = S_{i,t}(j)$. Node $Tlone_{v_i,j}$ has two outcomes: 
vton, bringing $V_i$ to $Tlone_{v_i,j+1}$ (or $n_f$ if $j = |S_{i,t}|$), and $vtoc_r$, bringing $V_i$ 
to $Pair_{c_r,v_i}$. The two outcomes from $Flone_{v_i,j}$ are similar. 
Fig. 3. A part of $\mathcal{N}_\phi$ where clause $c_j$ is $(i_2 \vee \neg i \vee \neg i_3)$ and clause $c_k$ is 
$(i_4 \vee \neg i \vee i_5)$. Timing is [0,0] wherever not mentioned. 


— Node Pair», has V; and C, as its processes and one outcome ctof which 
takes process C, to final node ny and process V; to Tlone,, j+1 (with c, = 
Sitlj)), or to np if 7 = |Si4|. Node Paire„~v; is defined in the same way 
from Flones, j- 


With this we claim that $\mathcal{N}_\phi$ has a final run iff $\phi$ is 
satisfiable, which completes the first step of the proof. We give a formal proof of 
this claim in Appendix A of [1]. Observe that the negotiation $\mathcal{N}_\phi$ 
constructed is deterministic and acyclic (but it is not sound). 


Step 2: Before we introduce timing on $\mathcal{N}_\phi$, we introduce a new 
outcome $r'$ at $n_0$ which takes all processes to $n_f$. Now, the timing function 
$\gamma$ associated with $\mathcal{N}_\phi$ is: $\gamma(n_0, r) = [2,2]$ and 
$\gamma(n_0, r') = [3,3]$ and $\gamma(n, r) = [0,0]$ for every node $n \ne n_0$ 
and all $r \in R_n$. Then $mintime(\mathcal{N}_\phi) \le 2$ iff $\phi$ has a 
satisfying assignment: if $mintime(\mathcal{N}_\phi) \le 2$, there is a run with 
decision $r$ taken at $n_0$ which is final. But the existence of any such final run 
implies satisfiability of $\phi$. For the reverse implication, if $\phi$ is 
satisfiable, then the run corresponding to a satisfying assignment takes 2 time 
units, which means that $mintime(\mathcal{N}_\phi) \le 2$. 

Similarly, we can prove that the MaxTime problem is co-NP complete by 
changing $\gamma(n_0, r') = [1,1]$ and asking if $maxtime(\mathcal{N}_\phi) > 1$ 
for the new $\mathcal{N}_\phi$. The answer will be yes iff $\phi$ is satisfiable. 


We now consider the related problem of checking if $mintime(\mathcal{N}) = T$ (or if 
$maxtime(\mathcal{N}) = T$). These problems are harder than their threshold variants 
under usual complexity assumptions: they are DP-complete (Difference Polynomial 
time class, i.e., second level of the Boolean Hierarchy, defined as intersection of 
a problem in NP and one in co-NP [15]). 


Proposition 2. The mintime(N) = T and maxtime(N) = T decision prob- 
lems are DP-complete for acyclic deterministic negotiations. 


Proof. We only give the proof for mintime (the proof for maxtime is given in 
Appendix A of [1]). Indeed, it is easy to see that this problem is in DP, as it can 
be written as $mintime(\mathcal{N}) \le T$, which is in NP, and 
$\neg(mintime(\mathcal{N}) \le T - 1)$, which is in co-NP. To show hardness, we use 
the negotiation constructed in the above proof as a gadget, and show a reduction 
from the SAT-UNSAT problem (a standard DP-complete problem). 
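The membership argument can be sketched in a few lines of Python. Here `at_most` is a hypothetical oracle for the threshold problem $mintime(\mathcal{N}) \le t$ (an assumption of this sketch, not an algorithm from the paper); the exact-value question is the conjunction of one positive and one negative threshold query.

```python
def mintime_equals(at_most, T):
    """Decide mintime(N) = T from a threshold oracle.

    at_most(t) is assumed to return True iff mintime(N) <= t.
    The first query is the NP part, the negated second query the co-NP part.
    """
    return at_most(T) and not at_most(T - 1)
```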

The SAT-UNSAT problem asks, given two Boolean expressions $\phi$ and $\phi'$, both 
in CNF with three literals per clause, whether $\phi$ is satisfiable and $\phi'$ 
is unsatisfiable. SAT-UNSAT is known to be DP-complete [15]. We reduce this 
problem to $mintime(\mathcal{N}) = T$. 

Given $\phi, \phi'$, we first make the corresponding negotiations 
$\mathcal{N}_\phi$ and $\mathcal{N}_{\phi'}$ as in the previous proof. Let $n_0$ 
and $n_f$ be the initial and final nodes of $\mathcal{N}_\phi$, and $n'_0$ and 
$n'_f$ be the initial and final nodes of $\mathcal{N}_{\phi'}$. (Similarly, for 
other nodes we write ' above the nodes to signify they belong to 
$\mathcal{N}_{\phi'}$.) 

In the negotiation $\mathcal{N}_{\phi'}$, we introduce a new node $n_{all}$, in 
which all the processes participate (see Figure 4). The node $n_{all}$ has a single 
outcome $r'_{all}$ which sends all the processes to $n'_f$. Also, for node $n'_0$, 
apart from the outcome $r$ which sends all processes to different nodes, there is 
another outcome $r_{all}$ which sends all the processes to $n_{all}$. Now we merge 
the nodes $n_f$ and $n'_0$ and call the merged node $n_{sep}$. Also nodes $n_0$ and 
$n'_f$ now have all the processes of $\mathcal{N}_\phi$ and $\mathcal{N}_{\phi'}$ 
participating in them. This merging gives us a new negotiation 
$\mathcal{N}_{\phi,\phi'}$ in which the structure above $n_{sep}$ is the same as 
$\mathcal{N}_\phi$ while below it is the same as $\mathcal{N}_{\phi'}$. Node 
$n_{sep}$ now has all the processes of $\mathcal{N}_\phi$ and 
$\mathcal{N}_{\phi'}$ participating in it. The outcomes of $n_{sep}$ are the same 
as those of $n'_0$ ($r_{all}$, $r$). For both the outcomes of $n_{sep}$, the 
processes corresponding to $\mathcal{N}_\phi$ go directly to $n'_f$ of 
$\mathcal{N}_{\phi,\phi'}$. Similarly, $n_0$ of $\mathcal{N}_{\phi,\phi'}$, which is 
the same as $n_0$ of $\mathcal{N}_\phi$, sends the processes corresponding to 
$\mathcal{N}_{\phi'}$ directly to $n_{sep}$ for all its outcomes. We now define the 
timing function $\gamma$ for $\mathcal{N}_{\phi,\phi'}$ as follows: 
$\gamma(Lone_{v_i}, r) = [1,1]$ for all $v_i \in \phi$ and $r \in \{true, false\}$, 
$\gamma(n_{all}, r'_{all}) = [2,2]$ and $\gamma(n, r) = [0,0]$ for all other 
outcomes of nodes. With this construction, one can conclude that 
$mintime(\mathcal{N}_{\phi,\phi'}) = 2$ iff $\phi$ is satisfiable and $\phi'$ is 
unsatisfiable (see [1] for details). This completes the reduction and hence 
proves DP-hardness. 


Fig. 4. Structure of $\mathcal{N}_{\phi,\phi'}$ 


Finally, we consider a related problem of computing the min and max time. 
To consider the decision variant, we rephrase this problem as checking whether 
an arbitrary bit of the minimum execution time is 1. Perhaps surprisingly, we 
obtain that this problem goes even beyond DP, the second level of the Boolean 
Hierarchy, and is in fact hard for $\Delta_2^P$ (second level of the polynomial 
hierarchy), which contains the entire Boolean Hierarchy. Formally, 


Theorem 4. Given an acyclic deterministic timed negotiation and a positive 
integer $k$, computing the $k$-th bit of the maximum/minimum execution time is 
$\Delta_2^P$-complete. 


Finally, we remark that if we were interested in the optimization variant and 
not the decision variant of the problem, the above proof can be adapted to show 
that these variants are OptP-complete (as defined in [13]). But as optimization 
is not the focus of this paper, we avoid formal details of this proof. 


6 Sound Negotiations 


Sound negotiations are negotiations in which every run can be extended to 
a final run, as in Fig. 1. In this section, we show that $maxtime(\mathcal{N})$ can 
be computed in PTIME for sound negotiations, hence giving PTIME complexities 
for the $maxtime(\mathcal{N}) \le T$? and $maxtime(\mathcal{N}) = T$? questions. 
However, we show that $mintime(\mathcal{N}) \le T$ is NP-complete for sound 
negotiations, and that $mintime(\mathcal{N}) = T$ is DP-complete, even if $T$ is 
given in unary. 

Consider the graph $G_\mathcal{N}$ of a negotiation $\mathcal{N}$. Let 
$\pi = (n_0, (p_0, r_0), n_1) \cdots (n_k, (p_k, r_k), n_{k+1})$ be a path of 
$G_\mathcal{N}$. We define the maximal execution time of a path $\pi$ as the value 
$\delta^+(\pi) = \sum_{i=0}^{k} \gamma^+(n_i, r_i)$. We say that a path 
$\pi = (n_0, (p_0, r_0), n_1) \cdots (n_\ell, (p_\ell, r_\ell), n_{\ell+1})$ is a 
path of some run 
$\rho = (M_1, \mu_1) \xrightarrow{(n'_1, r'_1)} \cdots (M_k, \mu_k)$ 
if $r_0, \ldots, r_\ell$ is a subword of $r'_1, \ldots, r'_k$. 


Lemma 1. Let $\mathcal{N}$ be an acyclic and sound timed negotiation. Then 
$maxtime(\mathcal{N}) = \max_{\pi \in Paths(G_\mathcal{N})} \delta^+(\pi) + \gamma^+(n_f, r_f)$. 


Proof. Let us first prove that 
$maxtime(\mathcal{N}) \ge \max_{\pi \in Paths(G_\mathcal{N})} \delta^+(\pi) + \gamma^+(n_f, r_f)$. 
Consider any path $\pi$ of $G_\mathcal{N}$, ending in some node $n$. First, as 
$\mathcal{N}$ is sound, we can compute a run $\rho_\pi$ such that $\pi$ is a path 
of $\rho_\pi$, and $\rho_\pi$ ends in a configuration in which $n$ is enabled. We 
associate with $\rho_\pi$ the timed run $\rho_\pi^+$ which associates to every node 
the latest possible execution date. We easily have 
$\delta(\rho_\pi^+) \ge \delta^+(\pi)$, and then we obtain 
$\max_{\pi \in Paths(G_\mathcal{N})} \delta(\rho_\pi^+) \ge \max_{\pi \in Paths(G_\mathcal{N})} \delta^+(\pi)$. 
As $maxtime(\mathcal{N})$ is the maximal duration over all runs, it is hence 
necessarily greater than 
$\max_{\pi \in Paths(G_\mathcal{N})} \delta^+(\pi) + \gamma^+(n_f, r_f)$. 

We now prove that 
$maxtime(\mathcal{N}) \le \max_{\pi \in Paths(G_\mathcal{N})} \delta^+(\pi) + \gamma^+(n_f, r_f)$. 
Take any timed run 
$\rho = (M_1, \mu_1) \xrightarrow{(n_1, r_1)} \cdots (M_k, \mu_k)$ of $\mathcal{N}$ 
with a unique maximal node $n_k$. We show that there exists a path $\pi$ of $\rho$ 
such that $\delta(\rho) \le \delta^+(\pi)$ by induction on the length $k$ of 
$\rho$. The initialization is trivial for $k = 1$. Let $k \in \mathbb{N}$. Because 
$n_k$ is the unique maximal node of $\rho$, we have 
$\delta^+(\rho) = \max_{p \in P_{n_k}} \mu_{k-1}(p) + \gamma^+(n_k, r_k)$. 
We choose one $p_{k-1}$ maximizing $\mu_{k-1}(p)$. Let $\ell < k$ be the maximal 
index of a decision involving process $p_{k-1}$ (i.e. $p_{k-1} \in P_{n_\ell}$). 
Now, consider the timed run $\rho'$, a subword of $\rho$, but with $n_\ell$ as 
unique maximal node (that is, it is $\rho$ where nodes $n_i$, $i > \ell$ have been 
removed, but also where some nodes $n_i$, $i < \ell$ have been removed if they are 
not causally before $n_\ell$ (in particular, $P_{n_i} \cap P_{n_\ell} = \emptyset$)). 

By definition, we have that 
$\delta^+(\rho) = \delta^+(\rho') + \gamma^+(n_\ell, r_\ell) + \gamma^+(n_k, r_k)$. 
We apply the induction hypothesis on $\rho'$, and obtain a path $\pi'$ of $\rho'$ 
ending in $n_\ell$ such that 
$\delta^+(\rho') + \gamma^+(n_\ell, r_\ell) \le \delta^+(\pi')$. It suffices to 
consider the path $\pi = \pi'.(n_\ell, (p_{k-1}, r_\ell), n_k)$ to prove the 
inductive step $\delta^+(\rho) \le \delta^+(\pi) + \gamma^+(n_k, r_k)$. 

Thus $maxtime(\mathcal{N}) = \max_\rho \delta^+(\rho) \le \max_{\pi \in Paths(G_\mathcal{N})} \delta^+(\pi) + \gamma^+(n_f, r_f)$. 


Lemma 1 gives a way to evaluate the maximal execution time. This amounts 
to finding a path of maximal weight in an acyclic graph, which is a standard 
PTIME problem that can be solved using standard max-cost calculation. 


Proposition 3. Computing the maximal execution time for an acyclic sound 
negotiation $\mathcal{N} = (N, n_0, n_f, \mathcal{X})$ can be done in time 
$O(|N| + |\mathcal{X}|)$. 
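The max-cost calculation behind Lemma 1 can be sketched as a longest-path computation over a topological order. The sketch below is an illustration under assumptions, not the authors' implementation: the graph $G_\mathcal{N}$ is given as a list of `(n, weight, m)` triples, where `weight` stands for $\gamma^+(n, r)$ of the outcome labelling the edge, and the function name `longest_path` is hypothetical. It uses `graphlib.TopologicalSorter` from the Python standard library (Python 3.9 or later).

```python
from graphlib import TopologicalSorter

def longest_path(edges, n0, nf):
    """Maximal total weight of a path from n0 to nf in an acyclic graph.

    edges: iterable of (n, weight, m) triples, one per edge of the graph.
    Returns None if nf is not reachable from n0.
    """
    preds = {}
    for n, w, m in edges:
        preds.setdefault(m, []).append((n, w))
        preds.setdefault(n, [])
    # TopologicalSorter expects a mapping node -> predecessors.
    order = TopologicalSorter(
        {m: [n for n, _ in ps] for m, ps in preds.items()}
    ).static_order()
    best = {}
    for m in order:
        if m == n0:
            best[m] = 0
            continue
        candidates = [best[n] + w for n, w in preds[m] if n in best]
        if candidates:  # nodes unreachable from n0 are skipped
            best[m] = max(candidates)
    return best.get(nf)
```

Following Lemma 1, the maximal execution time would then be this value plus $\gamma^+(n_f, r_f)$. Each edge is inspected once, which matches the linear bound of Proposition 3.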


A direct consequence is that the $maxtime(\mathcal{N}) \le T$ and 
$maxtime(\mathcal{N}) = T$ problems can be solved in polynomial time when 
$\mathcal{N}$ is sound. Notice that if $\mathcal{N}$ is deterministic but not 
sound, then Lemma 1 does not hold: we only have an inequality. 

We now turn to $mintime(\mathcal{N})$. We show that it is strictly harder to 
compute for sound negotiations than $maxtime(\mathcal{N})$. 


Theorem 5. $mintime(\mathcal{N}) \le T$ is NP-complete in the strong sense for sound 
acyclic negotiations, even if $\mathcal{N}$ is very weakly non-deterministic. 


Proof (sketch). First, we can decide $mintime(\mathcal{N}) \le T$ in NP. Indeed, one 
can guess a final (untimed) run $\rho$ of size $\le |\mathcal{N}|$, consider 
$\rho^-$, the timed run corresponding to $\rho$ where all outcomes are taken at the 
earliest possible dates, compute $\delta(\rho^-)$ in linear time, and check that 
$\delta(\rho^-) \le T$. 
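The verification step of this NP argument can be sketched as follows. The sketch assumes the usual clock-based reading of timed runs (each process carries a clock, a node fires once all its processes are ready, and its outcome takes its minimal duration); the data layout and the name `earliest_duration` are hypothetical.

```python
def earliest_duration(run, procs, gamma_min):
    """Duration of a run when every outcome is taken as early as possible.

    run:        list of (node, outcome) pairs, in execution order
    procs:      dict node -> set of processes taking part in the node
    gamma_min:  dict (node, outcome) -> lower bound of the timing interval
    """
    clock = {}  # current date of each process
    end = 0
    for node, outcome in run:
        # The node can start only when its slowest process has arrived.
        start = max((clock.get(p, 0) for p in procs[node]), default=0)
        date = start + gamma_min[(node, outcome)]
        for p in procs[node]:
            clock[p] = date
        end = max(end, date)
    return end
```

A certificate check for $mintime(\mathcal{N}) \le T$ then amounts to `earliest_duration(run, procs, gamma_min) <= T`, assuming the run itself has been checked to be a valid final run.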

The hardness part is obtained by reduction from the Bin Packing problem. 
The reduction is similar to the one from Knapsack, which we present in Thm. 7. The 
difference is that we use $\ell$ bins in parallel, rather than 2 processes, one for 
the weight and one for the value. The hardness is thus strong, but the negotiation 
is not k-layered for a bounded k (it is $2\ell + 1$ bounded, with $\ell$ depending 
on the input). A detailed proof is given in Appendix B of [1]. 


We show that $mintime(\mathcal{N}) = T$ is harder to decide than 
$mintime(\mathcal{N}) \le T$, with a proof similar to Prop. 2. 


Proposition 4. The mintime(N) = T? decision problem is DP-complete for 
sound acyclic negotiations, even if it is very weakly non-deterministic. 


An open question is whether the minimal execution time can be computed in 
PTIME if the negotiation is both sound and deterministic. The reduction from 
Bin Packing does not work with deterministic (and sound) negotiations. 


7 k-Layered Negotiations 


In this section, we consider k-layeredness, a syntactic property that can be effi- 
ciently verified (see Section 2). 


7.1 Algorithmic properties 


Let k be a fixed integer. We first show that the maximum execution time can be 
computed in PTIME for k-layered negotiations. Let $N_i$ be the set of nodes at 
layer $i$. We define for every layer $i$ the set $S_i$ of subsets of nodes 
$X \subseteq N_i$ which can be jointly enabled and such that for every process $p$, 
there is exactly one node $n(X,p)$ in $X$ with $p \in n(X,p)$. An element $X$ in 
$S_i$ is a subset of nodes that can be selected by solving all non-determinism with 
an appropriate choice of outcomes. Formally, we define $S_i$ inductively. We start 
with $S_0 = \{\{n_0\}\}$. We then define $S_{i+1}$ from the contents of $S_i$: we 
have $Y \in S_{i+1}$ iff $\bigcup_{n \in Y} P_n = P$ and there exist $X \in S_i$ and 
an outcome $r_m \in R_m$ for every $m \in X$, such that 
$n \in \mathcal{X}(n(X,p), p, r_{n(X,p)})$ for each $n \in Y$ and $p \in P_n$. 


Theorem 6. Let $k \in \mathbb{N}^+$. Computing the maximum execution time for a 
k-layered acyclic negotiation $\mathcal{N}$ can be done in PTIME. More precisely, 
the worst-case time complexity is $O(|P| \cdot |\mathcal{N}|^{k+1})$. 

Proof (Sketch). The first step is to compute $S_i$ layer by layer, by following its 
inductive definition. The set $S_i$ is of size at most $2^k$, as $|N_i| \le k$ by 
definition of k-layeredness. Knowing $S_i$, it is easy to build $S_{i+1}$ by 
induction. This takes time in $O(|P| \cdot |\mathcal{N}|^{k+1})$: we need to 
consider all k-tuples of outcomes for each layer. There can be $|\mathcal{N}|^k$ 
such tuples. We need to do that for all processes ($|P|$), and for all layers (at 
most $|\mathcal{N}|$). 

We then keep for each subset $X \in S_i$ and each node $n \in X$, the maximal 
time $f_i(X, n) \in \mathbb{N}$ associated with $n$ and $X$. From $S_{i+1}$ and 
$f_i$, we inductively compute $f_{i+1}$ in the following way: for all $X \in S_i$ 
with successor $Y \in S_{i+1}$ for outcomes $(r_p)_{p \in P}$, we denote 
$f_{i+1}(Y, n, X) = \max_{p \in P(n)} f_i(X, n(X,p)) + \gamma^+(n(X,p), r_p)$. 
If there are several choices of $(r_p)_{p \in P}$ leading to the same $Y$, 
we take $r_p$ with the maximal $f_i(X, n(X,p)) + \gamma^+(n(X,p), r_p)$. We then 
define $f_{i+1}(Y, n) = \max_{X \in S_i} f_{i+1}(Y, n, X)$. Again, the 
initialization is trivial, with $f_0(\{n_0\}, n_0) = 0$. The maximal execution time 
of $\mathcal{N}$ is $f_d(\{n_f\}, n_f)$, for $d$ the depth of $n_f$. 


We can bound the complexity precisely by 
$O(d(\mathcal{N}) \cdot C(\mathcal{N}) \cdot \|R\|^{k^*})$, with: 


- $d(\mathcal{N}) \le |\mathcal{N}|$ the depth of $n_f$, that is the number of 
layers of $\mathcal{N}$, and $\|R\|$ is the maximum number of outcomes of a node, 

- $C(\mathcal{N}) = \max_i |S_i| \le 2^k$, which we will call the number of contexts 
of $\mathcal{N}$, and which is often much smaller than $2^k$. 

- $k^* = \max_{X \in \bigcup_i S_i} |X| \le k$. We say that $\mathcal{N}$ is 
$k^*$-thread bounded, meaning that there cannot be more than $k^*$ nodes in the 
same context $X$ of any layer. Usually, $k^*$ is strictly smaller than 
$k = \max_i |N_i|$, as $N_i = \bigcup_{X \in S_i} X$. 


Consider again the Brexit example of Figure 1. We have $(k + 1) = 7$, while 
we have the depth $d(\mathcal{N}) = 6$, the negotiation is $k^* = 3$-thread bounded 
($k^*$ is bounded by the number of processes), $\|R\| = 2$, and the number of 
contexts is at most $C(\mathcal{N}) = 4$ (EU chooses to enforce backstop or not, and 
Pa chooses to go to court or not). 


7.2 Minimal Execution Time 


As with sound negotiations, computing minimal time is much harder than com- 
puting the maximal time for k-layered negotiations: 


Theorem 7. Let $k \ge 6$. The Min $\le T$ problem is NP-complete for k-layered 
acyclic negotiations, even if the negotiation is sound and very weakly non-deterministic. 


Proof. One can guess in polynomial time a final run of size $\le |\mathcal{N}|$. If 
the execution time of this final run is at most $T$ then we have found a final run 
witnessing $mintime(\mathcal{N}) \le T$. Hence the problem is in NP. 

Let us now show that the problem is NP-hard. We proceed by reduction from 
the Knapsack decision problem. Let us consider a set of items 
$U = \{u_1, \ldots, u_n\}$ of respective values $v_1, \ldots, v_n$ and weights 
$w_1, \ldots, w_n$, and a knapsack of maximal capacity $W$. The knapsack problem 
asks, given a value $V$, whether there exists a subset of items $U' \subseteq U$ 
such that $\sum_{u_i \in U'} v_i \ge V$ and such that 
$\sum_{u_i \in U'} w_i \le W$. 

Fig. 5. The negotiation encoding Knapsack 


We build a negotiation with $2n$ processes $P = \{p_1, \ldots, p_{2n}\}$, as shown 
in Fig. 5. Intuitively, $p_i$, $i \le n$ will serve to encode the value of selected 
items as timing, while $p_i$, $i > n$ will serve to encode the weight of selected 
items as timing. 

Concerning timing constraints for outcomes we do the following: Outcomes 
0, yes and no are associated with [0,0]. Outcome $c_i$ is associated with 
$[w_i, w_i]$, the weight of $u_i$. Last, outcome $b_i$ is associated with a more 
complex function, such that $\sum_i b_i \le W$ iff $\sum_i v_i \ge V$. For that, we 
set $\left[\frac{(v_{max} - v_i) W}{n \cdot v_{max} - V}, \frac{(v_{max} - v_i) W}{n \cdot v_{max} - V}\right]$ 
for outcome $b_i$, where $v_{max}$ is the largest value of an item, and $V$ is the 
total value we want to reach at least. Also, we set 
$\left[\frac{v_{max} W}{n \cdot v_{max} - V}, \frac{v_{max} W}{n \cdot v_{max} - V}\right]$ 
for outcome $a_i$. We set $T = W$, the maximal weight of the knapsack. 

Now, consider a final run $\rho$ in $\mathcal{N}$. The only choices in $\rho$ are 
outcomes yes or no from $C_1, \ldots, C_n$. Let $I$ be the set of indices such that 
yes is the outcome from all $C_i$ in this path. We obtain 
$\delta(\rho) = \max\left(\sum_{i \notin I} a_i + \sum_{i \in I} b_i, \sum_{i \in I} c_i\right)$. 
We have $\delta(\rho) \le T = W$ iff $\sum_{i \in I} w_i \le W$, that is the sum of 
the weights is lower than $W$, and 
$\sum_{i \notin I} \frac{v_{max} W}{n \cdot v_{max} - V} + \sum_{i \in I} \frac{(v_{max} - v_i) W}{n \cdot v_{max} - V} \le W$. 
That is, $n \cdot v_{max} - \sum_{i \in I} v_i \le n \cdot v_{max} - V$, i.e. 
$\sum_{i \in I} v_i \ge V$. Hence, there exists a path $\rho$ with 
$\delta(\rho) \le T = W$ iff there exists a set of items of weight less than $W$ 
and of value more than $V$. 
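The arithmetic of this encoding can be checked on small instances with the following Python sketch. It only models the durations $a_i$, $b_i$, $c_i$ and the resulting $\delta(\rho)$ for each choice of $I$, not the negotiation itself. It assumes $W > 0$ and $V < n \cdot v_{max}$ so that the denominators are positive, and all function names are hypothetical. Exact rationals are used because the durations need not be integers.

```python
from fractions import Fraction
from itertools import combinations

def knapsack_timings(values, weights, W, V):
    """Durations of outcomes a_i, b_i, c_i (assumes V < n * max(values))."""
    n, vmax = len(values), max(values)
    denom = n * vmax - V
    a = [Fraction(vmax * W, denom) for _ in values]
    b = [Fraction((vmax - v) * W, denom) for v in values]
    c = list(weights)
    return a, b, c

def run_duration(I, a, b, c):
    """delta(rho) for the final run that answers yes exactly for indices in I."""
    value_side = sum(b[i] if i in I else a[i] for i in range(len(a)))
    weight_side = sum(c[i] for i in I)
    return max(value_side, weight_side)

def subsets(n):
    return (set(I) for k in range(n + 1) for I in combinations(range(n), k))

def mintime_at_most_W(values, weights, W, V):
    """Brute-force check of mintime <= T = W over all choices of I."""
    a, b, c = knapsack_timings(values, weights, W, V)
    return any(run_duration(I, a, b, c) <= W for I in subsets(len(values)))

def knapsack(values, weights, W, V):
    """Brute-force Knapsack decision: value >= V and weight <= W."""
    return any(sum(values[i] for i in I) >= V and
               sum(weights[i] for i in I) <= W
               for I in subsets(len(values)))
```

On any instance satisfying the assumptions, `mintime_at_most_W` and `knapsack` should agree, which is the equivalence claimed in the proof.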


It is well known that Knapsack is weakly NP-hard, that is, it is NP-hard only 
when weights/values are given in binary. This means that Thm. 7 shows that 
minimum execution time $\le T$ is NP-hard only when $T$ is given in binary. We 
can actually show that for k-layered negotiations, the $mintime(\mathcal{N}) \le T$ 
problem can be decided in PTIME if $T$ is given in unary (i.e. if $T$ is not too 
large): 


Theorem 8. Let $k \in \mathbb{N}$. Given a k-layered negotiation $\mathcal{N}$ and 
$T$ written in unary, one can decide in PTIME whether the minimum execution time of 
$\mathcal{N}$ is $\le T$. The worst-case time complexity is 
$O(|\mathcal{N}| \cdot |P| \cdot (T \cdot |\mathcal{N}|)^k)$. 


Proof. We will remember for each layer $i$ a set $\mathcal{T}_i$ of functions 
$\tau$ from nodes $N_i$ of layer $i$ to a value in $\{1, \ldots, T, \bot\}$. 
Basically, we have $\tau \in \mathcal{T}_i$ if there exists a path $\rho$ reaching 
$X = \{n \in N_i \mid \tau(n) \ne \bot\}$, and this path reaches node $n \in X$ 
after $\tau(n)$ time units. As for $S_i$, for all $p$, we should have a unique node 
$n(\tau, p)$ such that $p \in n(\tau, p)$ and $\tau(n(\tau, p)) \ne \bot$. Again, it 
is easy to initialize $\mathcal{T}_0 = \{\tau_0\}$, with $\tau_0(n_0) = 0$, and 
$\tau_0(n) = \bot$ for all $n \ne n_0$. 

Inductively, we build $\mathcal{T}_{i+1}$ in the following way: 
$\tau_{i+1} \in \mathcal{T}_{i+1}$ iff there exists a $\tau_i \in \mathcal{T}_i$ 
and $r_p \in R_{n(\tau_i, p)}$ for all $p \in P$ such that for all $n$ with 
$\tau_{i+1}(n) \ne \bot$, we have 
$\tau_{i+1}(n) = \max_p \tau_i(n(\tau_i, p)) + \gamma(n(\tau_i, p), r_p)$. 

We have that the minimum execution time for $\mathcal{N}$ is 
$\min_{\tau \in \mathcal{T}_n} \tau(n_f)$, for $n$ the depth of $n_f$. There are at 
most $T^k$ functions $\tau$ in any $\mathcal{T}_i$, and there are at most 
$|\mathcal{N}|$ layers to consider, giving the complexity. 
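The layer-by-layer construction can be sketched in Python for the simpler deterministic case, where each process has a single successor node per outcome. This restriction, the dictionary-based data layout, and the name `min_time_at_most` are assumptions of the sketch. A function $\tau$ is stored as the set of its `(node, time)` pairs with non-$\bot$ value, and any $\tau$ exceeding $T$ is pruned, which is what keeps the number of stored functions bounded.

```python
from itertools import product

def min_time_at_most(procs, outcomes, succ, gamma_min, n0, nf, T):
    """Decide whether some final run reaches nf within T time units.

    procs[n]          -> set of processes of node n
    outcomes[n]       -> list of outcomes of node n
    succ[(n, p, r)]   -> next node of process p after outcome r (deterministic)
    gamma_min[(n, r)] -> earliest duration of outcome r at node n
    """
    layer = {frozenset({(n0, 0)})}  # T_0 contains only tau_0
    while layer:
        next_layer = set()
        for tau in layer:
            time = dict(tau)
            if nf in time:
                return True  # pruning below guarantees time[nf] <= T
            nodes = sorted(time)
            # Try every combination of outcomes for the enabled nodes.
            for choice in product(*(outcomes[n] for n in nodes)):
                arrivals = {}
                for n, r in zip(nodes, choice):
                    done = time[n] + gamma_min[(n, r)]
                    for p in procs[n]:
                        arrivals.setdefault(succ[(n, p, r)], []).append(done)
                # A successor is enabled only if all its processes arrived.
                enabled = all(len(ts) == len(procs[m])
                              for m, ts in arrivals.items())
                new_tau = frozenset((m, max(ts)) for m, ts in arrivals.items())
                if enabled and all(t <= T for _, t in new_tau):
                    next_layer.add(new_tau)
        layer = next_layer
    return False
```

The sketch does not implement the refinement of keeping only minimal functions $\tau$; adding that would reduce the number of stored states further.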


As with Thm. 6, we can more accurately state the complexity as 
$O(d(\mathcal{N}) \cdot C(\mathcal{N}) \cdot \|R\|^{k^*} \cdot T^{k^*-1})$. The 
$k^* - 1$ is because we only need to remember minimal functions 
$\tau \in \mathcal{T}_i$: if $\tau'(n) \ge \tau(n)$ for all $n$, then we do not 
need to keep $\tau'$ in $\mathcal{T}_i$. In particular, for the knapsack encoding in 
the proof of Thm. 7, we have $k^* = 3$, $\|R\| = 2$ and $C(\mathcal{N}) = 4$. Notice 
that if k is part of the input, then the problem is strongly NP-hard, even if $T$ 
is given in unary, as e.g. encoding bin packing with $\ell$ bins results in a 
$(2\ell + 1)$-layered negotiation. 


8 Conclusion 


In this paper, we considered timed negotiations. We believe that time is of the 
essence in negotiations, as exemplified by the Brexit negotiation. It is thus 
important to be able to compute in a tractable way the minimal and maximal 
execution time of negotiations. We showed that we can compute in PTIME 
the maximal execution time for acyclic negotiations that are either sound or 
k-layered, for k fixed. We showed that we cannot compute in PTIME the maximal 
execution time for negotiations that are neither sound nor k-layered, even if 
they are deterministic and acyclic (unless NP=PTIME). We also showed that, 
surprisingly, computing the minimal execution time is much harder, with strong 
NP-hardness results in most of the classes of negotiations, contradicting a claim 
in [10]. We came up with a new reasonable class of negotiations, namely k-layered 
negotiations, which enjoys a pseudo PTIME algorithm to compute the minimal 
execution time. That is, the algorithm is PTIME when the timing constants 
are given in unary. We showed that this restriction is necessary, as the problem 
becomes NP-hard for constants given in binary, even when the negotiation 
is sound and very weakly non-deterministic. Whether the minimal execution time 
can be computed in PTIME for deterministic and sound negotiations remains open. 


Abstract. Cartesian differential categories are categories equipped with 
a differential combinator which axiomatizes the directional derivative. 
Important models of Cartesian differential categories include classical 
differential calculus of smooth functions and categorical models of the 
differential λ-calculus. However, Cartesian differential categories cannot 
account for other interesting notions of differentiation such as the calcu- 
lus of finite differences or the Boolean differential calculus. On the other 
hand, change action models have been shown to capture these examples 
as well as more “exotic” examples of differentiation. However, change 
action models are very general and do not share the nice properties of 
a Cartesian differential category. In this paper, we introduce Cartesian 
difference categories as a bridge between Cartesian differential categories 
and change action models. We show that every Cartesian differential cat- 
egory is a Cartesian difference category, and how certain well-behaved 
change action models are Cartesian difference categories. In particular, 
Cartesian difference categories model both the differential calculus of 
smooth functions and the calculus of finite differences. Furthermore, ev- 
ery Cartesian difference category comes equipped with a tangent bundle 
monad whose Kleisli category is again a Cartesian difference category. 


Keywords: Cartesian Difference Categories · Cartesian Differential Categories · 
Change Actions · Calculus Of Finite Differences · Stream Calculus. 


1 Introduction 


In the early 2000s, Ehrhard and Regnier introduced the differential λ-calculus 
[10], an extension of the λ-calculus equipped with a differential combinator 
capable of taking the derivative of arbitrary higher-order functions. This develop- 
ment, based on models of linear logic equipped with a natural notion of “deriva- 
tive” [11], sparked a wave of research into categorical models of differentiation. 

One of the most notable developments in the area is the introduction of 
Cartesian differential categories [4] by Blute, Cockett and Seely, which provide an 
abstract categorical axiomatization of the directional derivative from differential 
calculus. The relevance of Cartesian differential categories lies in their ability to 
model both “classical” differential calculus (with the canonical example being the 
category of Euclidean spaces and smooth functions between them) and the differential 
λ-calculus (as every categorical model for it gives rise to a Cartesian differential 
category [14]). However, while Cartesian differential categories have proven to 
be an immensely successful formalism, they have, by design, some limitations. 
Firstly, they cannot account for certain “exotic” notions of derivative, such as 
the difference operator from the calculus of finite differences [16] or the Boolean 
differential calculus [19]. This is because the axioms of a Cartesian differential 
category stipulate that derivatives should be linear in their second argument (in 
the same way that the directional derivative is), whereas these aforementioned 
discrete sorts of derivative need not be. Additionally, every Cartesian differential 
category is equipped with a tangent bundle monad [7, 15] whose Kleisli category 
can be intuitively understood as a category of generalized vector fields. This 
Kleisli category has an obvious differentiation operator which comes close to 
making it a Cartesian differential category, but again fails the requirement of 
being linear in its second argument. 


More recently, discrete derivatives have been suggested as a semantic frame- 
work for understanding incremental computation. This led to the development 
of change structures [6] and change actions [2]. Change action models have been 
successfully used to provide a model for incrementalizing Datalog programs [1], 
but have also been shown to model the calculus of finite differences as well as 
the Kleisli category of the tangent bundle monad of a Cartesian differential cate- 
gory. Change action models, however, are very general, lacking many of the nice 
properties of Cartesian differential categories (for example, addition in a change 
action model is not required to be commutative), even though they are verified 
in most change action models. As a consequence of this generality, the tangent 
bundle endofunctor in a change action model can fail to be a monad. 


In this work, we introduce Cartesian difference categories (Section 4.2), whose 
key ingredients are an infinitesimal extension operator and a difference combi- 
nator, whose axioms are a generalization of the differential combinator axioms 
of a Cartesian differential category. In Section 4.3, we show that every Cartesian 
differential category is, in fact, a Cartesian difference category whose infinites- 
imal extension operator is zero, and conversely how every Cartesian difference 
category admits a full subcategory which is a Cartesian differential category. In 
Section 4.4, we show that every Cartesian difference category is a change action 
model, and conversely how a full subcategory of suitably well-behaved objects of 
a change action model is a Cartesian difference category. In Section 6, we show 
that every Cartesian difference category comes equipped with a monad whose 
Kleisli category is again a Cartesian difference category. Finally, in Section 5 we 
provide some examples of Cartesian difference categories; notably, the calculus 
of finite differences and the stream calculus. 


2 Cartesian Differential Categories 


In this section, we briefly review Cartesian differential categories, so that the 
reader may compare Cartesian differential categories with the new notion of 
Cartesian difference categories which we introduce in the next section. For a full 
detailed introduction on Cartesian differential categories, we refer the reader to 
the original paper [4]. 


2.1 Cartesian Left Additive Categories 


Here we recall the definition of Cartesian left additive categories [4], where 
“additive” means being skew enriched over commutative monoids, which in 
particular means that we do not assume the existence of additive inverses, i.e., 
“negative elements”. By a Cartesian category we mean a category $\mathbb{X}$ with 
chosen finite products where we denote the binary product of objects $A$ and $B$ by 
$A \times B$ with projection maps $\pi_0 : A \times B \to A$ and 
$\pi_1 : A \times B \to B$ and pairing operation $\langle -, - \rangle$, and the 
chosen terminal object as $\top$ with unique terminal maps $!_A : A \to \top$. 


Definition 1. A left additive category [4] is a category $\mathbb{X}$ such that each 
hom-set $\mathbb{X}(A, B)$ is a commutative monoid with addition operation 
$+ : \mathbb{X}(A, B) \times \mathbb{X}(A, B) \to \mathbb{X}(A, B)$ and zero element 
(called the zero map) $0 \in \mathbb{X}(A, B)$, such that pre-composition preserves 
the additive structure: $(f + g) \circ h = f \circ h + g \circ h$ and 
$0 \circ f = 0$. A map $k$ in a left additive category is additive if 
post-composition by $k$ preserves the additive structure: 
$k \circ (f + g) = k \circ f + k \circ g$ and $k \circ 0 = 0$. 
A Cartesian left additive category [4] is a Cartesian category $\mathbb{X}$ which is 
also a left additive category such that all projection maps 
$\pi_0 : A \times B \to A$ and $\pi_1 : A \times B \to B$ are additive. 


We note that the definition given here of a Cartesian left additive category 
is slightly different from the one found in [4], but it is indeed equivalent. By [4, 
Proposition 1.2.2], an equivalent axiomatization of a Cartesian left additive 
category is that of a Cartesian category where every object comes equipped 
with a commutative monoid structure such that the projection maps are monoid 
morphisms. This will be important later in Section 4.2. 


2.2 Cartesian Differential Categories 


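Definition 2. A Cartesian differential category [4] is a Cartesian left 
additive category equipped with a differential combinator $\mathsf{D}$ of the form 

$$\frac{f : A \to B}{\mathsf{D}[f] : A \times A \to B}$$

verifying the following coherence conditions: 


[CD.1] $\mathsf{D}[f + g] = \mathsf{D}[f] + \mathsf{D}[g]$ and $\mathsf{D}[0] = 0$ 
[CD.2] $\mathsf{D}[f] \circ \langle x, y + z \rangle = \mathsf{D}[f] \circ \langle x, y \rangle + \mathsf{D}[f] \circ \langle x, z \rangle$ and $\mathsf{D}[f] \circ \langle x, 0 \rangle = 0$ 
[CD.3] $\mathsf{D}[1_A] = \pi_1$ and $\mathsf{D}[\pi_0] = \pi_0 \circ \pi_1$ and $\mathsf{D}[\pi_1] = \pi_1 \circ \pi_1$ 
[CD.4] $\mathsf{D}[\langle f, g \rangle] = \langle \mathsf{D}[f], \mathsf{D}[g] \rangle$ and $\mathsf{D}[!_A] = \,!_{A \times A}$ 
[CD.5] $\mathsf{D}[g \circ f] = \mathsf{D}[g] \circ \langle f \circ \pi_0, \mathsf{D}[f] \rangle$ 
[CD.6] $\mathsf{D}[\mathsf{D}[f]] \circ \langle \langle x, y \rangle, \langle 0, z \rangle \rangle = \mathsf{D}[f] \circ \langle x, z \rangle$ 
[CD.7] $\mathsf{D}[\mathsf{D}[f]] \circ \langle \langle x, y \rangle, \langle z, 0 \rangle \rangle = \mathsf{D}[\mathsf{D}[f]] \circ \langle \langle x, z \rangle, \langle y, 0 \rangle \rangle$ 


Note that here, following the more recent work on Cartesian differential 
categories, we have flipped the convention found in [4], so that the linear 
argument is the second argument rather than the first. 

We highlight that by [7, Proposition 4.2], the last two axioms [CD.6] and 
[CD.7] have an equivalent alternative expression. 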


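Lemma 1. In the presence of the other axioms, [CD.6] and [CD.7] are 
equivalent to: 


[CD.6.a] $\mathsf{D}[\mathsf{D}[f]] \circ \langle \langle x, 0 \rangle, \langle 0, y \rangle \rangle = \mathsf{D}[f] \circ \langle x, y \rangle$ 
[CD.7.a] $\mathsf{D}[\mathsf{D}[f]] \circ \langle \langle x, y \rangle, \langle z, w \rangle \rangle = \mathsf{D}[\mathsf{D}[f]] \circ \langle \langle x, z \rangle, \langle y, w \rangle \rangle$ 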


As a Cartesian difference category is a generalization of a Cartesian differ- 
ential category, we leave the discussion of the intuition of these axioms for later 
in Section 4.2 below. We also refer to [4, Section 4] for a term calculus which 
may help better understand the axioms of a Cartesian differential category. The 
canonical example of a Cartesian differential category is the category of real 
smooth functions, which we will discuss in Section 5.1. Other interesting 
examples can be found throughout the literature, such as categorical models of the 
differential λ-calculus [10, 14], the subcategory of differential objects of a tangent 
category [7], and the coKleisli category of a differential category [3, 4]. 


3 Change Action Models 


Change actions [1, 2] have recently been proposed as a setting for reasoning about 
higher-order incremental computation, based on a discrete notion of differentia- 
tion. Together with Cartesian differential categories, they provide the core ideas 
behind Cartesian difference categories. In this section, we quickly review change 
actions and change action models, in particular, to highlight where some of the 
axioms of a Cartesian difference category come from. For more details on change 
actions, we invite readers to see the original paper [2]. 


3.1 Change Actions 


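Definition 3. A change action $\overline{A}$ in a Cartesian category $\mathbb{X}$ is 
a quintuple $\overline{A} = (A, \Delta A, \oplus_A, +_A, 0_A)$ consisting of two 
objects $A$ and $\Delta A$, and three maps: 


$\oplus_A : A \times \Delta A \to A$ \qquad $+_A : \Delta A \times \Delta A \to \Delta A$ \qquad $0_A : \top \to \Delta A$ 


such that $(\Delta A, +_A, 0_A)$ is a monoid and 
$\oplus_A : A \times \Delta A \to A$ is an action of $\Delta A$ on $A$, that is, the 
following equalities hold: 


$\oplus_A \circ \langle 1_A, 0_A \circ\, !_A \rangle = 1_A$ \qquad $\oplus_A \circ (1_A \times +_A) = \oplus_A \circ (\oplus_A \times 1_{\Delta A})$ 

For a change action $\overline{A}$ and given a pair of maps $f : C \to A$ and 
$g : C \to \Delta A$, we define $f \oplus_{\overline{A}} g : C \to A$ as 
$f \oplus_{\overline{A}} g = \oplus_A \circ \langle f, g \rangle$. Similarly, for 
maps $h : C \to \Delta A$ and $k : C \to \Delta A$, define 
$h +_{\overline{A}} k = +_A \circ \langle h, k \rangle$. Therefore, that $\oplus_A$ 
is an action of $\Delta A$ on $A$ can be rewritten as: 


$1_A \oplus_{\overline{A}} 0_A = 1_A$ \qquad $1_A \oplus_{\overline{A}} (1_{\Delta A} +_{\overline{A}} 1_{\Delta A}) = (1_A \oplus_{\overline{A}} 1_{\Delta A}) \oplus_{\overline{A}} 1_{\Delta A}$ 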


The intuition behind the above definition is that the monoid AA is a type of 
possible “changes” or “updates” that might be applied to A, with the monoid 
structure on AA representing the capability to compose updates. 

Change actions give rise to a notion of derivative, with a distinctly “discrete” 
flavour. Given change actions on objects A and B, a map f : A > B can be 
said to be differentiable when changes to the input (in the sense of elements 
of AA) are mapped to changes to the output (that is, elements of AB). In 
the setting of incremental computation, this is precisely what it means for f to 
be incrementalizable, with the derivative of f corresponding to an incremental 
version of f. 


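Definition 4. Let $\overline{A} = (A, \Delta A, \oplus_A, +_A, 0_A)$ and 
$\overline{B} = (B, \Delta B, \oplus_B, +_B, 0_B)$ be change actions. For a map 
$f : A \to B$, a map $\partial[f] : A \times \Delta A \to \Delta B$ is a 
derivative of $f$ whenever the following equalities hold: 


[CAD.1] $f \circ (x \oplus_{\overline{A}} y) = (f \circ x) \oplus_{\overline{B}} (\partial[f] \circ \langle x, y \rangle)$ 
[CAD.2] $\partial[f] \circ \langle x, y +_{\overline{A}} z \rangle = (\partial[f] \circ \langle x, y \rangle) +_{\overline{B}} (\partial[f] \circ \langle x \oplus_{\overline{A}} y, z \rangle)$ and 
$\partial[f] \circ \langle x, 0_B \circ\, !_B \rangle = 0_B \circ\, !_{A \times \Delta A}$ 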


The intuition for these axioms will be explained in more detail in Section 
4.2 when we explain the axioms of a Cartesian difference category. Note that 
although there is nothing in the above definition guaranteeing that any given 
map has at most a single derivative, the chain rule does hold. As a corollary, 
differentiation is compositional and therefore the change actions in X form a 
category. 
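To make the two axioms concrete, here is a small Python sketch of a change action in the category of sets, under assumptions that go beyond the text: the base object and the changes are both the integers, $\oplus$ and $+$ are ordinary addition, and the axioms are read elementwise rather than point-free. The names `oplus`, `plus` and `check_derivative` are hypothetical. For $f(x) = x^2$, the finite difference $f(x + dx) - f(x) = 2x \cdot dx + dx^2$ is a derivative in the sense of Definition 4.

```python
# Change action on the integers: A = Delta A = int.
def oplus(x, dx):
    return x + dx          # action of a change on a point

def plus(dx, dy):
    return dx + dy         # composition of changes

ZERO = 0                   # the zero change

def f(x):
    return x * x

def df(x, dx):
    # Finite difference of f: f(x + dx) - f(x), expanded.
    return 2 * x * dx + dx * dx

def check_derivative(f, df, samples):
    """Check [CAD.1] and [CAD.2] elementwise on a finite set of samples."""
    for x in samples:
        assert df(x, ZERO) == ZERO                          # [CAD.2], zero part
        for y in samples:
            assert f(oplus(x, y)) == oplus(f(x), df(x, y))  # [CAD.1]
            for z in samples:
                lhs = df(x, plus(y, z))
                rhs = plus(df(x, y), df(oplus(x, y), z))
                assert lhs == rhs                           # [CAD.2], sum part
    return True
```

Note that `df` is not linear in its second argument, which is exactly the behaviour that Cartesian differential categories exclude and change actions allow.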


Lemma 2. Whenever $\partial[f]$ and $\partial[g]$ are derivatives for composable 
maps $f$ and $g$ respectively, then 
$\partial[g] \circ \langle f \circ \pi_0, \partial[f] \rangle$ is a derivative for 
$g \circ f$. 
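The chain rule of Lemma 2 can be illustrated in the same set-based setting, again as a sketch under assumptions (integers with addition as the change action, elementwise reading, hypothetical names). The pairing $\langle f \circ \pi_0, \partial[f] \rangle$ sends $(x, dx)$ to $(f(x), \partial[f](x, dx))$, so the composite derivative is `dg(f(x), df(x, dx))`.

```python
def compose_with_derivative(f, df, g, dg):
    """Return g . f together with the derivative given by the chain rule."""
    def gf(x):
        return g(f(x))

    def dgf(x, dx):
        # d[g] applied to the pair (f(x), d[f](x, dx))
        return dg(f(x), df(x, dx))

    return gf, dgf

# Example maps on the integers, with finite differences as derivatives.
def square(x):
    return x * x

def d_square(x, dx):
    return 2 * x * dx + dx * dx

def affine(y):
    return 3 * y + 1

def d_affine(y, dy):
    return 3 * dy

gf, dgf = compose_with_derivative(square, d_square, affine, d_affine)
```

For these maps, `gf(x + dx) == gf(x) + dgf(x, dx)` holds for all integers, which is [CAD.1] for the composite.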


3.2 Change Action Models 


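Definition 5. Given a Cartesian category $\mathbb{X}$, define its change actions 
category $\mathsf{CAct}(\mathbb{X})$ as the category whose objects are change 
actions in $\mathbb{X}$ and whose arrows $f : \overline{A} \to \overline{B}$ are the 
pairs $(f, \partial[f])$, where $f : A \to B$ is an arrow in $\mathbb{X}$ and 
$\partial[f] : A \times \Delta A \to \Delta B$ is a derivative for $f$. The identity 
is $(1_A, \pi_1)$, while composition of $(f, \partial[f])$ and $(g, \partial[g])$ is 
$(g \circ f, \partial[g] \circ \langle f \circ \pi_0, \partial[f] \rangle)$. 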


There is an obvious product-preserving forgetful functor $\mathcal{E} : \mathsf{CAct}(\mathbb{X}) \to \mathbb{X}$
sending every change action $(A, \Delta A, \oplus, +, 0)$ to its base object $A$ and every
map $(f, \partial[f])$ to the underlying map $f$. As a setting for studying differentiation,
the category $\mathsf{CAct}(\mathbb{X})$ is rather lacklustre, since there is no notion of higher
derivatives, so we will instead work with change action models. Informally, a
change action model consists of a rule which associates to every object $A$ of $\mathbb{X}$
a change action over it, and to every map a choice of a derivative.


Definition 6. A change action model on a Cartesian category $\mathbb{X}$ is a product-
preserving functor $\alpha : \mathbb{X} \to \mathsf{CAct}(\mathbb{X})$ that is a section of the forgetful functor $\mathcal{E}$.


For brevity, when $A$ is an object of a change action model, we will write $\Delta A$,
$\oplus_A$, $+_A$, and $0_A$ to refer to the components of the corresponding change action
$\alpha(A)$. Examples of change action models can be found in [2]. In particular, we
highlight that a Cartesian differential category always provides a change action
model. We will generalize this result, and show in Section 4.4 that a Cartesian
difference category also always provides a change action model.


4 Cartesian Difference Categories 


In this section, we introduce Cartesian difference categories, which are gener- 
alizations of Cartesian differential categories. Examples of Cartesian difference 
categories can be found in Section 5. 


4.1 Infinitesimal Extensions in Left Additive Categories 


We first introduce infinitesimal extensions. An infinitesimal extension is an
operator that turns a map into an “infinitesimal” version of itself, in the sense
that every map coincides with its Taylor approximation on infinitesimal elements.


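Definition 7. A Cartesian left additive category $\mathbb{X}$ is said to have an infinites-
imal extension $\varepsilon$ if every homset $\mathbb{X}(A, B)$ comes equipped with a monoid mor-
phism $\varepsilon : \mathbb{X}(A, B) \to \mathbb{X}(A, B)$, that is, $\varepsilon(f + g) = \varepsilon(f) + \varepsilon(g)$ and $\varepsilon(0) = 0$, and
such that $\varepsilon(g \circ f) = \varepsilon(g) \circ f$ and $\varepsilon(\pi_0) = \pi_0 \circ \varepsilon(1_{A \times B})$ and $\varepsilon(\pi_1) = \pi_1 \circ \varepsilon(1_{A \times B})$.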


Note that since $\varepsilon(g \circ f) = \varepsilon(g) \circ f$, it follows that $\varepsilon(f) = \varepsilon(1_B) \circ f$ and
$\varepsilon(1_A) : A \to A$ is an additive map (Definition 1). In light of this, it turns out
that infinitesimal extensions can equivalently be described as a class of additive
maps $\varepsilon_A : A \to A$ such that $\varepsilon_{A \times B} = \varepsilon_A \times \varepsilon_B$. The equivalence is given by setting
$\varepsilon(f) = \varepsilon_B \circ f$ and $\varepsilon_A = \varepsilon(1_A)$. Furthermore, infinitesimal extensions equip
each object with a canonical change action structure:


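Lemma 3. Let $\mathbb{X}$ be a Cartesian left additive category with infinitesimal exten-
sion $\varepsilon$. For every object $A$, define the maps $\oplus_A : A \times A \to A$ as $\oplus_A = \pi_0 + \varepsilon(\pi_1)$,
$+_A : A \times A \to A$ as $+_A = \pi_0 + \pi_1$, and $0_A : \top \to A$ as $0_A = 0$. Then $(A, A, \oplus_A, +_A, 0_A)$
is a change action in $\mathbb{X}$.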


Proof. As mentioned earlier, that $(A, +_A, 0_A)$ is a commutative monoid was
shown in [4]. On the other hand, that $\oplus_A$ is a change action follows from the
fact that $\varepsilon$ preserves the addition. □




Setting $A = (A, A, \oplus_A, +_A, 0_A)$, we note that $f \oplus_A g = f + \varepsilon(g)$ and $f +_A g =
f + g$, and so in particular $+_A = +$. Therefore, from now on we will omit the
subscripts and simply write $\oplus$ and $+$.

For every Cartesian left additive category, there are always at least two pos- 
sible infinitesimal extensions: 


Lemma 4. For any Cartesian left additive category $\mathbb{X}$,


1. Setting $\varepsilon(f) = 0$ defines an infinitesimal extension on $\mathbb{X}$ and therefore in
this case, $\oplus_A = \pi_0$ and $f \oplus g = f$.

2. Setting $\varepsilon(f) = f$ defines an infinitesimal extension on $\mathbb{X}$ and therefore in
this case, $\oplus_A = +_A$ and $f \oplus g = f + g$.


We note that while these examples of infinitesimal extensions may seem triv- 
ial, they are both very important as they will give rise to key examples of Carte- 
sian difference categories. 


4.2 Cartesian Difference Categories 


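Definition 8. A Cartesian difference category is a Cartesian left additive
category with an infinitesimal extension $\varepsilon$ which is equipped with a difference
combinator $\partial$ of the form:


$$\frac{f : A \to B}{\partial[f] : A \times A \to B}$$


verifying the following coherence conditions:


[C∂.0] $f \circ (x + \varepsilon(y)) = f \circ x + \varepsilon(\partial[f] \circ \langle x, y \rangle)$

[C∂.1] $\partial[f + g] = \partial[f] + \partial[g]$, $\partial[0] = 0$, and $\partial[\varepsilon(f)] = \varepsilon(\partial[f])$

[C∂.2] $\partial[f] \circ \langle x, y + z \rangle = \partial[f] \circ \langle x, y \rangle + \partial[f] \circ \langle x + \varepsilon(y), z \rangle$ and $\partial[f] \circ \langle x, 0 \rangle = 0$

[C∂.3] $\partial[1_A] = \pi_1$ and $\partial[\pi_0] = \pi_0 \circ \pi_1$ and $\partial[\pi_1] = \pi_1 \circ \pi_1$

[C∂.4] $\partial[\langle f, g \rangle] = \langle \partial[f], \partial[g] \rangle$ and $\partial[!_A] = !_{A \times A}$

[C∂.5] $\partial[g \circ f] = \partial[g] \circ \langle f \circ \pi_0, \partial[f] \rangle$

[C∂.6] $\partial[\partial[f]] \circ \langle \langle x, y \rangle, \langle 0, z \rangle \rangle = \partial[f] \circ \langle x + \varepsilon(y), z \rangle$

[C∂.7] $\partial[\partial[f]] \circ \langle \langle x, y \rangle, \langle z, 0 \rangle \rangle = \partial[\partial[f]] \circ \langle \langle x, z \rangle, \langle y, 0 \rangle \rangle$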


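Before giving some intuition on the axioms [C∂.0] to [C∂.7], we first observe
that one could have used change action notation to express [C∂.0], [C∂.2], and
[C∂.6], which would then be written as:


[C∂.0] $f \circ (x \oplus y) = (f \circ x) \oplus (\partial[f] \circ \langle x, y \rangle)$

[C∂.2] $\partial[f] \circ \langle x, y + z \rangle = \partial[f] \circ \langle x, y \rangle + \partial[f] \circ \langle x \oplus y, z \rangle$ and $\partial[f] \circ \langle x, 0 \rangle = 0$

[C∂.6] $\partial[\partial[f]] \circ \langle \langle x, y \rangle, \langle 0, z \rangle \rangle = \partial[f] \circ \langle x \oplus y, z \rangle$


And also, just like Cartesian differential categories, [C∂.6] and [C∂.7] have
alternative equivalent expressions.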


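Lemma 5. In the presence of the other axioms, [C∂.6] and [C∂.7] are equiv-
alent to:

[C∂.6.a] $\partial[\partial[f]] \circ \langle \langle x, 0 \rangle, \langle 0, y \rangle \rangle = \partial[f] \circ \langle x, y \rangle$

[C∂.7.a] $\partial[\partial[f]] \circ \langle \langle x, y \rangle, \langle z, w \rangle \rangle = \partial[\partial[f]] \circ \langle \langle x, z \rangle, \langle y, w \rangle \rangle$


Proof. The proof is essentially the same as [7, Proposition 4.2]. □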


The keen-eyed reader will notice that the axioms of a Cartesian difference cat-
egory are very similar to the axioms of a Cartesian differential category. Indeed,
[C∂.1], [C∂.3], [C∂.4], [C∂.5], and [C∂.7] are the same as their Cartesian dif-
ferential category counterparts. The axioms which are different are [C∂.2] and
[C∂.6], where the infinitesimal extension $\varepsilon$ is now included, and there is also the
new extra axiom [C∂.0]. On the other hand, interestingly enough, [C∂.6.a] is
the same as [CD.6.a]. We also point out that writing out [C∂.0] and [C∂.2]
using change action notation, we see that these axioms are precisely [CAD.1] and
[CAD.2] respectively. To better understand [C∂.0] to [C∂.7] it may be useful
to write them out using element-like notation. In element-like notation, [C∂.0]
is written as:


$$f(x + \varepsilon(y)) = f(x) + \varepsilon(\partial[f](x, y))$$


This condition can be read as a generalization of the Kock-Lawvere axiom that
characterizes the derivative in synthetic differential geometry [13]. Broadly
speaking, the Kock-Lawvere axiom states that, for any map $f : R \to R$ and any
$x \in R$ and $d \in D$, there exists a unique $f'(x) \in R$ verifying


$$f(x + d) = f(x) + d \cdot f'(x)$$


where $D$ is the subset of $R$ consisting of infinitesimal elements. It is by analogy
with the Kock-Lawvere axiom that we refer to $\varepsilon$ as an “infinitesimal extension”
as it can be thought of as embedding the space $A$ into a subspace $\varepsilon(A)$ of
infinitesimal elements.

[C∂.1] states that the differential of a sum of maps is the sum of differentials,
and similarly for zero maps and the infinitesimal extension of a map. [C∂.2] is
the first crucial difference between a Cartesian difference category and a Carte-
sian differential category. In a Cartesian differential category, the differential of
a map is assumed to be additive in its second argument. In a Cartesian differ-
ence category, just as derivatives for change actions, while the differential is still
required to preserve zeros in its second argument, it is only additive “up to a
small perturbation”, that is:


$$\partial[f](x, y + z) = \partial[f](x, y) + \partial[f](x + \varepsilon(y), z)$$


[C∂.3] tells us what the differential of the identity and projection maps are,
while [C∂.4] says that the differential of a pairing of maps is the pairing of their
differentials. [C∂.5] is the chain rule which expresses what the differential of a
composition of maps is:


$$\partial[g \circ f](x, y) = \partial[g](f(x), \partial[f](x, y))$$


[C∂.6] and [C∂.7] tell us how to work with second order differentials. [C∂.6]
is expressed as follows:


$$\partial[\partial[f]](x, y, 0, z) = \partial[f](x + \varepsilon(y), z)$$

and finally [C∂.7] is expressed as:


$$\partial[\partial[f]](x, y, z, 0) = \partial[\partial[f]](x, z, y, 0)$$


It is interesting to note that while [C∂.6] is different from [CD.6], its alternative
version [C∂.6.a] is the same as [CD.6.a]:


$$\partial[\partial[f]]((x, 0), (0, y)) = \partial[f](x, y)$$


4.3 Another look at Cartesian Differential Categories 


Here we explain how a Cartesian differential category is a Cartesian difference 
category where the infinitesimal extension is given by zero. 


Proposition 1. Every Cartesian differential category $\mathbb{X}$ with differential com-
binator $\mathsf{D}$ is a Cartesian difference category where the infinitesimal extension is
defined as $\varepsilon(f) = 0$ and the difference combinator is defined to be the differential
combinator, $\partial = \mathsf{D}$.


Proof. As noted before, the first two parts of [C∂.1], the second part of
[C∂.2], [C∂.3], [C∂.4], [C∂.5], and [C∂.7] are precisely the same as their
Cartesian differential axiom counterparts. On the other hand, since $\varepsilon(f) = 0$,
[C∂.0] and the third part of [C∂.1] trivially state that $0 = 0$, while the first
part of [C∂.2] and [C∂.6] end up being precisely the first part of [CD.2] and
[CD.6]. Therefore, the differential combinator satisfies the Cartesian difference
axioms and we conclude that a Cartesian differential category is a Cartesian
difference category. □


Conversely, one can always build a Cartesian differential category from a 
Cartesian difference category by considering the objects for which the infinites- 
imal extension is the zero map. 


Proposition 2. For a Cartesian difference category $\mathbb{X}$ with infinitesimal exten-
sion $\varepsilon$ and difference combinator $\partial$, the full subcategory $\mathbb{X}_0$ of objects $A$
such that $\varepsilon(1_A) = 0$ is a Cartesian differential category where the differential
combinator is defined to be the difference combinator, $\mathsf{D} = \partial$.


Proof. First note that if $\varepsilon(1_A) = 0$ and $\varepsilon(1_B) = 0$, then by definition it also
follows that $\varepsilon(1_{A \times B}) = 0$, and also that for the terminal object $\varepsilon(1_\top) = 0$
by uniqueness of maps into the terminal object. Thus $\mathbb{X}_0$ is closed under finite
products and is therefore a Cartesian left additive category. Furthermore, we
again note that since $\varepsilon(f) = 0$, this implies that for maps between such objects
the Cartesian difference axioms are precisely the Cartesian differential axioms.
Therefore, the difference combinator is a differential combinator for this subcat-
egory, and so $\mathbb{X}_0$ is a Cartesian differential category. □

In any Cartesian difference category $\mathbb{X}$, the terminal object $\top$ always satisfies
$\varepsilon(1_\top) = 0$, and therefore $\mathbb{X}_0$ is never empty. On the other hand, applying
Proposition 2 to a Cartesian differential category results in the entire category.
It is also important to note that the above two propositions do not imply that
if a difference combinator is a differential combinator then the infinitesimal ex-
tension must be zero. In Section 5.3, we provide such an example of a Cartesian
differential category that comes equipped with a non-zero infinitesimal extension
such that the differential combinator is a difference combinator with respect to
this non-zero infinitesimal extension.


4.4 Cartesian Difference Categories as Change Action Models 


In this section, we show how every Cartesian difference category is a particu- 
larly well-behaved change action model, and conversely how every change action 
model contains a Cartesian difference category. 


Proposition 3. Let $\mathbb{X}$ be a Cartesian difference category with infinitesimal ex-
tension $\varepsilon$ and difference combinator $\partial$. Define the functor $\alpha : \mathbb{X} \to \mathsf{CAct}(\mathbb{X})$ as
$\alpha(A) = (A, A, \oplus_A, +_A, 0_A)$ (as defined in Lemma 3) and $\alpha(f) = (f, \partial[f])$. Then
$(\mathbb{X}, \alpha : \mathbb{X} \to \mathsf{CAct}(\mathbb{X}))$ is a change action model.


Proof. By Lemma 3, $(A, A, \oplus_A, +_A, 0_A)$ is a change action and so $\alpha$ is well-
defined on objects. For a map $f$, $\partial[f]$ is a derivative of $f$ in the change
action sense since [C∂.0] and [C∂.2] are precisely [CAD.1] and [CAD.2],
and so $\alpha$ is well-defined on maps. That $\alpha$ preserves identities and composition
follows from [C∂.3] and [C∂.5] respectively, and so $\alpha$ is a functor. That $\alpha$
preserves finite products follows from [C∂.3] and [C∂.4]. Lastly, it is clear
that $\alpha$ is a section of the forgetful functor, and therefore we conclude that $(\mathbb{X}, \alpha)$ is
a change action model. □


It is clear that not every change action model is a Cartesian difference cat- 
egory. For example, change action models do not require the addition to be 
commutative. On the other hand, it can be shown that every change action 
model contains a Cartesian difference category as a full subcategory. 


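Definition 9. Let $(\mathbb{X}, \alpha : \mathbb{X} \to \mathsf{CAct}(\mathbb{X}))$ be a change action model. An object $A$
is flat whenever the following hold:


[F.1] $\Delta A = A$

[F.2] $\alpha(\oplus_A) = (\oplus_A, \oplus_A \circ \pi_1)$

[F.3] $0 \oplus_A (0 \oplus_A f) = 0 \oplus_A f$ for any $f : U \to A$.

[F.4] $\oplus_A$ is right-injective, that is, if $\oplus_A \circ \langle f, g \rangle = \oplus_A \circ \langle f, h \rangle$ then $g = h$.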


We would like to show that for any change action model $(\mathbb{X}, \alpha)$, its full sub-
category of flat objects, $\mathsf{Flat}_\alpha$, is a Cartesian difference category. Starting with
the finite product structure, since $\alpha$ preserves finite products, it is straightfor-
ward to see that $\top$ is flat and if $A$ and $B$ are flat then so is $A \times B$. The
sum of maps $f : A \to B$ and $g : A \to B$ in $\mathsf{Flat}_\alpha$ is defined using the change
action structure, $f +_B g$, while the zero map $0 : A \to B$ is $0 = 0_B \circ !_A$. And so we
obtain that:

Lemma 6. $\mathsf{Flat}_\alpha$ is a Cartesian left additive category.


Proof. Most of the Cartesian left additive structure is straightforward. However,
since the addition is not required to be commutative for arbitrary change actions,
we will show that the addition is commutative for flat objects. Using that
$\oplus_B$ is an action, that by [F.2] we have that $\oplus_B \circ \pi_1$ is a derivative for $\oplus_B$, and
[CAD.1], we obtain that:


$$0_B \oplus_B (f +_B g) = (0_B \oplus_B f) \oplus_B g = (0_B \oplus_B g) \oplus_B f = 0_B \oplus_B (g +_B f)$$


By [F.4], $\oplus_B$ is right-injective and we conclude that $f + g = g + f$. □


As an immediate consequence, we note that for any change action model
$(\mathbb{X}, \alpha)$, since the terminal object is always flat, $\mathsf{Flat}_\alpha$ is never empty.

We use the action of the change action structure to define the infinitesimal
extension. So for a map $f : A \to B$ in $\mathsf{Flat}_\alpha$, define $\varepsilon(f) : A \to B$ as follows:


$$\varepsilon(f) = \oplus_B \circ \langle 0_B \circ !_A, f \rangle = 0 \oplus_B f$$

Lemma 7. $\varepsilon$ is an infinitesimal extension for $\mathsf{Flat}_\alpha$.


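Proof. We show that $\varepsilon$ preserves the addition. Following the same idea as in the
proof of Lemma 6, we obtain the following:


$$\begin{aligned}
0_B \oplus_B \varepsilon(f +_B g) &= 0_B \oplus_B (0_B \oplus_B (f +_B g)) \\
&= (0_B \oplus_B 0_B) \oplus_B ((0_B \oplus_B f) \oplus_B g) = (0_B \oplus_B (0_B \oplus_B f)) \oplus_B (0_B \oplus_B g) \\
&= (0_B \oplus_B \varepsilon(f)) \oplus_B \varepsilon(g) = 0_B \oplus_B (\varepsilon(f) +_B \varepsilon(g))
\end{aligned}$$


Then by [F.3], it follows that $\varepsilon(f + g) = \varepsilon(f) + \varepsilon(g)$. The remaining infinitesimal
extension axioms are proven in a similar fashion. □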


Lastly, the difference combinator for $\mathsf{Flat}_\alpha$ is defined in the obvious way, that
is, $\partial[f]$ is defined as the second component of $\alpha(f)$.


Proposition 4. Let $(\mathbb{X}, \alpha : \mathbb{X} \to \mathsf{CAct}(\mathbb{X}))$ be a change action model. Then
$\mathsf{Flat}_\alpha$ is a Cartesian difference category.


Proof (Sketch). The full calculations will appear in an upcoming extended jour-
nal version of this paper, but we give an informal explanation. [C∂.0] and
[C∂.2] are straightforward consequences of [CAD.1] and [CAD.2]. [C∂.3]
and [C∂.4] follow trivially from the fact that $\alpha$ preserves finite products and from
the structure of products in $\mathsf{CAct}(\mathbb{X})$, while [C∂.5] follows from composition in
$\mathsf{CAct}(\mathbb{X})$. [C∂.1], [C∂.6] and [C∂.7] are obtained by mechanical calculation in
the spirit of Lemma 6. Note that every axiom except for [C∂.6] can be proven
without using [F.3]. □


4.5 Linear Maps and ε-Linear Maps


An important subclass of maps in a Cartesian differential category is the subclass 
of linear maps [4, Definition 2.2.1]. One can also define linear maps in a Cartesian 
difference category by using the same definition. 


Definition 10. In a Cartesian difference category, a map $f$ is linear if the
following equality holds: $\partial[f] = f \circ \pi_1$.


Using element-like notation, a map $f$ is linear if $\partial[f](x, y) = f(y)$. Linear
maps in a Cartesian difference category satisfy many of the same properties
found in [4, Lemma 2.2.2].


Lemma 8. In a Cartesian difference category,


1. If $f : A \to B$ is linear then $\varepsilon(f) = f \circ \varepsilon(1_A)$;

2. If $f : A \to B$ is linear, then $f$ is additive (Definition 1);

3. Identity maps, projection maps, and zero maps are linear;

4. The composite, sum, and pairing of linear maps is linear;

5. If $f : A \to B$ and $k : C \to D$ are linear, then for any map $g : B \to C$, the
following equality holds: $\partial[k \circ g \circ f] = k \circ \partial[g] \circ (f \times f)$;

6. If an isomorphism is linear, then its inverse is linear;

7. For any object $A$, $\oplus_A$ and $+_A$ are linear.


Using element-like notation, the first point of the above lemma says that if
$f$ is linear then $f(\varepsilon(x)) = \varepsilon(f(x))$. And while all linear maps are additive, the
converse is not necessarily true, see [4, Corollary 2.3.4]. However, an immediate
consequence of the above lemma is that the subcategory of linear maps of a
Cartesian difference category has finite biproducts.

Another interesting subclass of maps is the subclass of $\varepsilon$-linear maps, which
are maps whose infinitesimal extension is linear.


Definition 11. In a Cartesian difference category, a map $f$ is $\varepsilon$-linear if $\varepsilon(f)$
is linear.


Lemma 9. In a Cartesian difference category,


1. If $f : A \to B$ is $\varepsilon$-linear then $f \circ (x + \varepsilon(y)) = f \circ x + \varepsilon(f) \circ y$;
2. Every linear map is $\varepsilon$-linear;

3. The composite, sum, and pairing of $\varepsilon$-linear maps is $\varepsilon$-linear;
4. If an isomorphism is $\varepsilon$-linear, then its inverse is again $\varepsilon$-linear.


Using element-like notation, the first point of the above lemma says that if
$f$ is $\varepsilon$-linear then $f(x + \varepsilon(y)) = f(x) + \varepsilon(f(y))$. So $\varepsilon$-linear maps are additive on
“infinitesimal” elements (i.e. those of the form $\varepsilon(y)$).

For a Cartesian differential category, linear maps in the Cartesian difference
category sense are precisely the same as the Cartesian differential category sense
[4, Definition 2.2.1], while every map is $\varepsilon$-linear since $\varepsilon = 0$.


5 Examples of Cartesian Difference Categories


5.1 Smooth Functions 


Every Cartesian differential category is a Cartesian difference category where the 
infinitesimal extension is zero. As a particular example, we consider the category 
of real smooth functions, which as mentioned above, can be considered to be the 
canonical (and motivating) example of a Cartesian differential category. 

Let $\mathbb{R}$ be the set of real numbers and let SMOOTH be the category whose
objects are Euclidean spaces $\mathbb{R}^n$ (including the point $\mathbb{R}^0 = \{*\}$), and whose
maps are smooth functions $F : \mathbb{R}^n \to \mathbb{R}^m$. SMOOTH is a Cartesian left additive
category where the product structure is given by the standard Cartesian product
of Euclidean spaces and where the additive structure is defined by point-wise
addition, $(F + G)(\vec{x}) = F(\vec{x}) + G(\vec{x})$ and $0(\vec{x}) = (0, \ldots, 0)$, where $\vec{x} \in \mathbb{R}^n$.
SMOOTH is a Cartesian differential category where the differential combinator
is defined by the directional derivative of smooth functions. Explicitly, for a
smooth function $F : \mathbb{R}^n \to \mathbb{R}^m$, which is in fact a tuple of smooth functions
$F = (f_1, \ldots, f_m)$ where $f_j : \mathbb{R}^n \to \mathbb{R}$, $\mathsf{D}[F] : \mathbb{R}^n \times \mathbb{R}^n \to \mathbb{R}^m$ is defined as follows:


$$\mathsf{D}[F](\vec{x}, \vec{y}) = \left( \sum_{i=1}^{n} \frac{\partial f_1}{\partial x_i}(\vec{x})\, y_i, \; \ldots, \; \sum_{i=1}^{n} \frac{\partial f_m}{\partial x_i}(\vec{x})\, y_i \right)$$


where $\vec{x} = (x_1, \ldots, x_n), \vec{y} = (y_1, \ldots, y_n) \in \mathbb{R}^n$. Alternatively, $\mathsf{D}[F]$ can also be
defined in terms of the Jacobian matrix of $F$. Therefore SMOOTH is a Carte-
sian difference category with infinitesimal extension $\varepsilon = 0$ and with difference
combinator $\mathsf{D}$. Since $\varepsilon = 0$, the induced action is simply $\vec{x} \oplus \vec{y} = \vec{x}$. Also a
smooth function is linear in the Cartesian difference category sense precisely if
it is $\mathbb{R}$-linear in the classical sense, and every smooth function is $\varepsilon$-linear.
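
To make the directional-derivative formula concrete, here is a minimal sketch for two polynomial maps chosen purely for illustration, $F(x_1, x_2) = (x_1 x_2, x_1 + x_2^2)$ and $G(u_1, u_2) = u_1 u_2$. The derivatives `DF`, `DG` and `D_GF` are written out by hand from the formula above (they are our own worked instances, not taken from the paper), and the code checks the chain rule and additivity in the second argument on a grid of integer points, where the arithmetic is exact.

```python
from itertools import product

# Hypothetical example maps; all names are illustrative assumptions.

def F(x):
    x1, x2 = x
    return (x1 * x2, x1 + x2 ** 2)


def DF(x, y):
    """D[F](x, y): the Jacobian of F at x applied to y, written out by hand."""
    x1, x2 = x
    y1, y2 = y
    return (x2 * y1 + x1 * y2, y1 + 2 * x2 * y2)


def G(u):
    u1, u2 = u
    return (u1 * u2,)


def DG(u, v):
    """D[G](u, v) for G(u1, u2) = u1 * u2."""
    u1, u2 = u
    v1, v2 = v
    return (u2 * v1 + u1 * v2,)


def D_GF(x, y):
    """D[G o F](x, y), obtained by differentiating x1^2*x2 + x1*x2^3 directly."""
    x1, x2 = x
    y1, y2 = y
    return ((2 * x1 * x2 + x2 ** 3) * y1 + (x1 ** 2 + 3 * x1 * x2 ** 2) * y2,)


def add(u, v):
    return tuple(a + b for a, b in zip(u, v))


points = list(product(range(-2, 3), repeat=2))

# Chain rule: D[G o F](x, y) = D[G](F(x), D[F](x, y))
chain_rule_ok = all(D_GF(x, y) == DG(F(x), DF(x, y)) for x in points for y in points)

# Additivity in the second argument: D[F](x, y + z) = D[F](x, y) + D[F](x, z)
additive_ok = all(
    DF(x, add(y, z)) == add(DF(x, y), DF(x, z))
    for x in points for y in points for z in points
)
```

Additivity in the second argument is exactly what survives of [C∂.2] when $\varepsilon = 0$, since the perturbation $x + \varepsilon(y)$ collapses to $x$.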


5.2 Calculus of Finite Differences 


Here we explain how the difference operator from the calculus of finite differences 
gives an example of a Cartesian difference category but not a Cartesian differ- 
ential category. This example was the main motivating example for developing 
Cartesian difference categories. The calculus of finite differences is captured by 
the category of abelian groups and arbitrary set functions between them. 

Let $\mathsf{Ab}$ be the category whose objects are abelian groups $G$ (where we use
additive notation for group structure) and where a map $f : G \to H$ is simply
an arbitrary function between them (and therefore does not necessarily preserve
the group structure). $\mathsf{Ab}$ is a Cartesian left additive category where the product
structure is given by the standard Cartesian product of sets and where the
additive structure is again given by point-wise addition, $(f + g)(x) = f(x) + g(x)$
and $0(x) = 0$. $\mathsf{Ab}$ is a Cartesian difference category where the infinitesimal
extension is simply given by the identity, that is, $\varepsilon(f) = f$, and where the
difference combinator $\partial$ is defined as follows for a map $f : G \to H$:


$$\partial[f](x, y) = f(x + y) - f(x)$$

On the other hand, $\partial$ is not a differential combinator for $\mathsf{Ab}$ since it does not
satisfy [CD.6] and part of [CD.2]. Thanks to the addition of the infinitesimal
extension, $\partial$ does satisfy [C∂.2] and [C∂.6], as well as [C∂.0]. However, as
noted in [5], it is interesting to note that this $\partial$ does satisfy [CD.1], the second
part of [CD.2], [CD.3], [CD.4], [CD.5], [CD.7], and [CD.6.a]. It is worth
noting that in [5], the goal was to drop the addition and develop a “non-additive”
version of Cartesian differential categories.

In $\mathsf{Ab}$, since the infinitesimal operator is given by the identity, the induced
action is simply the addition, $x \oplus y = x + y$. On the other hand, the linear maps
in $\mathsf{Ab}$ are precisely the group homomorphisms. Indeed, $f$ is linear if $\partial[f](x, y) =
f(y)$. But by [C∂.0] and [C∂.2], we get that:


$$f(x + y) = f(x) + \partial[f](x, y) = f(x) + f(y) \qquad f(0) = \partial[f](x, 0) = 0$$


So $f$ is a group homomorphism. Conversely, if $f$ is a group homomorphism:


$$\partial[f](x, y) = f(x + y) - f(x) = f(x) + f(y) - f(x) = f(y)$$


So $f$ is linear. Since $\varepsilon(f) = f$, the $\varepsilon$-linear maps are precisely the linear maps.
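
The claims of this subsection can be spot-checked numerically. The following minimal sketch assumes the abelian group is the integers and uses two sample functions (`cube` and `double`, both our own illustrative choices); it tests [C∂.0], [C∂.2], [C∂.5], [C∂.6] and [C∂.7] on a small range of values, and shows one instance where additivity in the second argument fails. Passing these checks is evidence on the sampled values only, not a proof.

```python
from itertools import product


def diff(f):
    """Difference combinator: diff(f)(x, y) = f(x + y) - f(x)."""
    return lambda x, y: f(x + y) - f(x)


def diff2(f):
    """Second difference: the difference combinator applied to diff(f),
    where diff(f) is viewed as a function on pairs (added componentwise)."""
    d = diff(f)
    return lambda x, y, z, w: d(x + z, y + w) - d(x, y)


def cube(x):
    return x ** 3


def double(x):
    return 2 * x


def axioms_hold(f, g, values):
    d, dd = diff(f), diff2(f)
    dg = diff(g)
    d_gf = diff(lambda x: g(f(x)))
    for x, y, z in product(values, repeat=3):
        ok = (
            f(x + y) == f(x) + d(x, y)                # [C∂.0] with ε = identity
            and d(x, y + z) == d(x, y) + d(x + y, z)  # [C∂.2], first part
            and d(x, 0) == 0                          # [C∂.2], second part
            and d_gf(x, y) == dg(f(x), d(x, y))       # [C∂.5], chain rule
            and dd(x, y, 0, z) == d(x + y, z)         # [C∂.6]
            and dd(x, y, z, 0) == dd(x, z, y, 0)      # [C∂.7]
        )
        if not ok:
            return False
    return True


values = range(-3, 4)
difference_axioms_ok = axioms_hold(cube, double, values) and axioms_hold(double, cube, values)

# Linearity (Definition 10): diff(f)(x, y) == f(y).
double_is_linear = all(diff(double)(x, y) == double(y) for x, y in product(values, repeat=2))
cube_is_linear = all(diff(cube)(x, y) == cube(y) for x, y in product(values, repeat=2))

# diff(cube) is not additive in its second argument, so the first part of [CD.2] fails.
additivity_gap = diff(cube)(0, 2) - (diff(cube)(0, 1) + diff(cube)(0, 1))
```

In this sketch `double` is a group homomorphism and comes out linear, while `cube` is not, matching the characterization of linear maps given above.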


5.3 Module Morphisms 


Here we provide a simple example of a Cartesian difference category whose dif- 
ference combinator is also a differential combinator, but where the infinitesimal 
extension is neither zero nor the identity. 

Let $R$ be a commutative semiring and let $\mathsf{MOD}_R$ be the category of $R$-
modules and $R$-linear maps between them. $\mathsf{MOD}_R$ has finite biproducts and is
therefore a Cartesian left additive category where every map is additive. Every
$r \in R$ induces an infinitesimal extension $\varepsilon^r$ defined by scalar multiplication,
$\varepsilon^r(f)(m) = r f(m)$. Then $\mathsf{MOD}_R$ is a Cartesian difference category with the
infinitesimal extension $\varepsilon^r$ for any $r \in R$ and difference combinator $\partial$ defined as:


$$\partial[f](m, n) = f(n)$$


$R$-linearity of $f$ assures that [C∂.0] holds, while the remaining Cartesian dif-
ference axioms hold trivially. In fact, $\partial$ is also a differential combinator and
therefore $\mathsf{MOD}_R$ is also a Cartesian differential category. The induced action is
given by $m \oplus n = m + rn$. By definition of $\partial$, every map in $\mathsf{MOD}_R$ is linear,
and by definition of $\varepsilon^r$ and $R$-linearity, every map is also $\varepsilon$-linear.
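
As a small worked instance, the sketch below assumes $R = \mathbb{Z}$, the scalar $r = 3$, and a $\mathbb{Z}$-linear map on $\mathbb{Z}^2$ given by an integer matrix; these choices and all helper names are ours and are only meant to illustrate the definitions. It checks [C∂.0], that is, $f(m + rn) = f(m) + r\,\partial[f](m, n)$, and the identity $\varepsilon(f) = f \circ \varepsilon(1)$ from Lemma 8 on a grid of vectors.

```python
from itertools import product

R_SCALAR = 3  # assumed choice of r in the semiring Z


def linear_map(matrix):
    """Z-linear map Z^n -> Z^m given by an integer matrix (list of rows)."""
    return lambda m: tuple(sum(a * x for a, x in zip(row, m)) for row in matrix)


def add(m, n):
    return tuple(a + b for a, b in zip(m, n))


def scale(r, m):
    return tuple(r * a for a in m)


def eps(f):
    """Infinitesimal extension: eps(f)(m) = r * f(m)."""
    return lambda m: scale(R_SCALAR, f(m))


def diff(f):
    """Difference combinator: diff(f)(m, n) = f(n)."""
    return lambda m, n: f(n)


f = linear_map([[1, 2], [0, -1]])
vectors = list(product(range(-2, 3), repeat=2))

# [C∂.0]: f(m + r n) = f(m) + r * diff(f)(m, n)
cd0_ok = all(
    f(add(m, scale(R_SCALAR, n))) == add(f(m), scale(R_SCALAR, diff(f)(m, n)))
    for m in vectors for n in vectors
)

# Lemma 8, first point: eps(f) = f o eps(1), i.e. r * f(m) = f(r * m)
eps_commutes_ok = all(eps(f)(m) == f(scale(R_SCALAR, m)) for m in vectors)

# Induced action m (+) n = m + r n, evaluated on one pair of vectors
induced_action = add((1, 1), scale(R_SCALAR, (2, 0)))
```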


5.4 Stream calculus 


Here we show how one can extend the calculus of finite differences example
to stream calculus. The differential calculus of causal functions and interesting
applications have recently been studied in [17, 18].

For a set $A$, let $A^\omega$ denote the set of infinite sequences of elements of $A$,
where we write $[a_i]$ for the infinite sequence $[a_i] = (a_0, a_1, a_2, \ldots)$ and $a_{i:j}$ for
the (finite) subsequence $(a_i, a_{i+1}, \ldots, a_j)$. A function $f : A^\omega \to B^\omega$ is causal
whenever the $n$-th element $f([a_i])_n$ of the output sequence only depends on the
first $n$ elements of $[a_i]$, that is, $f$ is causal if and only if whenever $a_{0:n} = b_{0:n}$
then $f([a_i])_{0:n} = f([b_i])_{0:n}$. We now consider streams over abelian groups, so
let $\mathsf{Ab}^\omega$ be the category whose objects are all the abelian groups and whose
morphisms are causal maps from $G^\omega$ to $H^\omega$. $\mathsf{Ab}^\omega$ is a Cartesian left additive
category, where the product is given by the standard product of abelian groups
and where the additive structure is lifted point-wise from the structure of $\mathsf{Ab}$,
that is, $(f + g)([a_i])_n = f([a_i])_n + g([a_i])_n$ and $0([a_i])_n = 0$. In order to define
the infinitesimal extension, we first need to define the truncation operator $z$. So
let $G$ be an abelian group and $[a_i] \in G^\omega$, then define the sequence $z([a_i])$ as:


$$z([a_i])_0 = 0 \qquad z([a_i])_{n+1} = a_{n+1}$$


The category $\mathsf{Ab}^\omega$ is a Cartesian difference category where the infinitesimal ex-
tension is given by the truncation operator, $\varepsilon(f)([a_i]) = z(f([a_i]))$,
and where the difference combinator $\partial$ is defined as follows:


$$\begin{aligned}
\partial[f]([a_i], [b_i])_0 &= f([a_i] + [b_i])_0 - f([a_i])_0 \\
\partial[f]([a_i], [b_i])_{n+1} &= f([a_i] + z([b_i]))_{n+1} - f([a_i])_{n+1}
\end{aligned}$$


Note the similarities between the difference combinator on $\mathsf{Ab}$ and that on $\mathsf{Ab}^\omega$.
The induced action is computed out to be:


$$([a_i] \oplus [b_i])_0 = a_0 \qquad ([a_i] \oplus [b_i])_{n+1} = a_{n+1} + b_{n+1}$$


A causal map is linear (in the Cartesian difference category sense) if and only
if it is a group homomorphism, while a causal map $f$ is $\varepsilon$-linear if and only if
it is a group homomorphism which does not depend on the 0-th term of its
input, that is, $f([a_i]) = f(z([a_i]))$.
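
The stream constructions can be experimented with on finite prefixes, since a causal map determines the first entries of its output from the first entries of its input. The sketch below assumes streams over the integers represented as Python lists, and uses a running sum of squares as a sample causal map; that map and all function names are hypothetical choices of ours. It computes the induced action and checks [C∂.0], $f(x + \varepsilon(y)) = f(x) + \varepsilon(\partial[f](x, y))$, on one pair of prefixes.

```python
def z(s):
    """Truncation operator on a finite prefix: set the 0-th entry to 0."""
    return [0 if i == 0 else a for i, a in enumerate(s)]


def add(s, t):
    return [a + b for a, b in zip(s, t)]


def sub(s, t):
    return [a - b for a, b in zip(s, t)]


def running_sum_of_squares(s):
    """A sample causal map: output n depends only on inputs 0..n."""
    out, total = [], 0
    for a in s:
        total += a * a
        out.append(total)
    return out


def diff(f):
    """Difference combinator on causal maps, following the two-case definition."""
    def df(a, b):
        at_zero = sub(f(add(a, b)), f(a))     # formula used at index 0
        at_succ = sub(f(add(a, z(b))), f(a))  # formula used at indices n + 1
        return at_zero[:1] + at_succ[1:]
    return df


def oplus(a, b):
    """Induced action: a (+) b = a + z(b)."""
    return add(a, z(b))


a, b = [1, 2, 3], [10, 20, 30]
f = running_sum_of_squares

lhs = f(oplus(a, b))                # f(x + eps(y))
rhs = add(f(a), z(diff(f)(a, b)))   # f(x) + eps(d[f](x, y))
```

On these prefixes the induced action leaves the 0-th entry of `a` untouched and adds the remaining entries of `b`, as in the displayed formula for $\oplus$.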


6 Tangent Bundles in Cartesian Difference Categories 


In this section, we show that the difference combinator of a Cartesian difference 
category induces a monad, called the tangent monad, whose Kleisli category 
is again a Cartesian difference category. This construction is a generalization 
of the tangent monad for Cartesian differential categories [7, 15]. However, the 
Kleisli category of the tangent monad of a Cartesian differential category is not 
a Cartesian differential category, but rather a Cartesian difference category. 


6.1 The Tangent Bundle Monad 


Let $\mathbb{X}$ be a Cartesian difference category with infinitesimal extension $\varepsilon$ and dif-
ference combinator $\partial$. Define the functor $T : \mathbb{X} \to \mathbb{X}$ as follows:


$$T(A) = A \times A \qquad T(f) = \langle f \circ \pi_0, \partial[f] \rangle$$

and define the natural transformations $\eta : 1_{\mathbb{X}} \to T$ and $\mu : T^2 \to T$ as follows:

$$\eta_A := \langle 1_A, 0 \rangle \qquad \mu_A := \langle \pi_0 \circ \pi_0, \; \pi_1 \circ \pi_0 + \pi_0 \circ \pi_1 + \varepsilon(\pi_1 \circ \pi_1) \rangle$$

Proposition 5. $(T, \mu, \eta)$ is a monad.


Proof. Functoriality of $T$ will follow from [C∂.3] and the chain rule [C∂.5].
Naturality of $\eta$ and $\mu$ and the monad identities will follow from the remain-
ing difference combinator axioms. The full lengthy brute force calculations will
appear in an upcoming extended journal version of this paper. □


When $\mathbb{X}$ is a Cartesian differential category with the difference structure aris-
ing from setting $\varepsilon = 0$, this tangent bundle monad coincides with the standard
tangent monad corresponding to its tangent category structure [7, 15].


6.2 The Kleisli Category of T 


Recall that the Kleisli category of the monad $(T, \mu, \eta)$ is defined as the category
$\mathbb{X}_T$ whose objects are the objects of $\mathbb{X}$, and where a map $A \to B$ in $\mathbb{X}_T$ is a map
$f : A \to T(B)$ in $\mathbb{X}$, which would be a pair $f = \langle f_0, f_1 \rangle$ where $f_j : A \to B$.
The identity map in $\mathbb{X}_T$ is the monad unit $\eta_A : A \to T(A)$, while composition
of Kleisli maps $f : A \to T(B)$ and $g : B \to T(C)$ is defined as the composite
$\mu_C \circ T(g) \circ f$. To distinguish between composition in $\mathbb{X}$ and $\mathbb{X}_T$, we denote Kleisli
composition as $g \circ^T f = \mu_C \circ T(g) \circ f$. If $f = \langle f_0, f_1 \rangle$ and $g = \langle g_0, g_1 \rangle$, then their
Kleisli composition can be explicitly computed out to be:


$$g \circ^T f = \langle g_0, g_1 \rangle \circ^T \langle f_0, f_1 \rangle = \langle g_0 \circ f_0, \; \partial[g_0] \circ \langle f_0, f_1 \rangle + g_1 \circ (f_0 + \varepsilon(f_1)) \rangle$$
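
The closed formula for Kleisli composition can be compared against the definition $\mu_C \circ T(g) \circ f$ in a concrete case. The sketch below assumes the finite-difference structure of Section 5.2 on the integers, so that $\varepsilon$ is the identity and $\partial[f](x, y) = f(x + y) - f(x)$; the sample maps and all helper names are our own illustrative choices. Agreement on the sampled inputs is a sanity check rather than a proof.

```python
def diff(f):
    """Difference combinator of Section 5.2 on integer-valued maps."""
    return lambda x, y: f(x + y) - f(x)


def T(g):
    """T(g) = <g o pi_0, d[g]> for a pair-valued map g = <g0, g1>;
    the difference of a pairing is taken componentwise ([C∂.4])."""
    def Tg(b, v):
        gb, gbv = g(b), g(b + v)
        return (gb, tuple(p - q for p, q in zip(gbv, gb)))
    return Tg


def mu(pq, rs):
    """mu sends ((p, q), (r, s)) to (p, q + r + eps(s)), with eps = identity here."""
    (p, q), (r, s) = pq, rs
    return (p, q + r + s)


def kleisli_via_monad(g, f):
    """g o^T f computed as mu o T(g) o f."""
    return lambda a: mu(*T(g)(*f(a)))


def kleisli_explicit(g0, g1, f0, f1):
    """g o^T f computed from the closed formula."""
    return lambda a: (
        g0(f0(a)),
        diff(g0)(f0(a), f1(a)) + g1(f0(a) + f1(a)),  # eps(f1) = f1 in this setting
    )


# Hypothetical sample Kleisli maps f = <f0, f1> and g = <g0, g1>.
f0, f1 = (lambda a: a * a), (lambda a: 3 * a + 1)
g0, g1 = (lambda b: b ** 3), (lambda b: b - 7)
f = lambda a: (f0(a), f1(a))
g = lambda b: (g0(b), g1(b))

via_monad = kleisli_via_monad(g, f)
explicit = kleisli_explicit(g0, g1, f0, f1)
formulas_agree = all(via_monad(a) == explicit(a) for a in range(-5, 6))
```

The two computations agree because, by [C∂.0], $g_1 \circ (f_0 + \varepsilon(f_1)) = g_1 \circ f_0 + \varepsilon(\partial[g_1] \circ \langle f_0, f_1 \rangle)$, which is exactly the contribution of the last two summands of $\mu$.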


Kleisli maps can be understood as “generalized” vector fields. Indeed, $T(A)$
should be thought of as the tangent bundle over $A$, and therefore a vector field
would be a map $\langle 1, f \rangle : A \to T(A)$, which is of course also a Kleisli map. For
more details on the intuition behind this Kleisli category see [7]. We now wish
to explain how the Kleisli category is again a Cartesian difference category.

We begin by exhibiting the Cartesian left additive structure of the Kleisli
category. The product of objects in $\mathbb{X}_T$ is defined as $A \times B$ with projections
$\pi_0^T : A \times B \to T(A)$ and $\pi_1^T : A \times B \to T(B)$ defined respectively as $\pi_0^T = \langle \pi_0, 0 \rangle$
and $\pi_1^T = \langle \pi_1, 0 \rangle$. The pairing of Kleisli maps $f = \langle f_0, f_1 \rangle$ and $g = \langle g_0, g_1 \rangle$ is
defined as $\langle f, g \rangle^T = \langle \langle f_0, g_0 \rangle, \langle f_1, g_1 \rangle \rangle$. The terminal object is again $\top$, and
the unique map to the terminal object is $!^T = 0$. The sum of Kleisli
maps $f = \langle f_0, f_1 \rangle$ and $g = \langle g_0, g_1 \rangle$ is defined as $f +^T g = f + g = \langle f_0 + g_0, f_1 + g_1 \rangle$,
and the zero Kleisli map is simply $0^T = 0 = \langle 0, 0 \rangle$. Therefore we conclude that
the Kleisli category of the tangent monad is a Cartesian left additive category.


Lemma 10. $\mathbb{X}_T$ is a Cartesian left additive category.


The infinitesimal extension $\varepsilon^T$ for the Kleisli category is defined as follows
for a Kleisli map $f = \langle f_0, f_1 \rangle$:


$$\varepsilon^T(f) = \langle 0, f_0 + \varepsilon(f_1) \rangle$$

Lemma 11. $\varepsilon^T$ is an infinitesimal extension on $\mathbb{X}_T$.


It is interesting to point out that for an object $A$ the induced action $\oplus_A^T$ can
be computed out to be:


$$\oplus_A^T = \pi_0^T +^T \varepsilon^T(\pi_1^T) = \langle \pi_0, 0 \rangle + \langle 0, \pi_1 \rangle = \langle \pi_0, \pi_1 \rangle = 1_{T(A)}$$


and we stress that this is the identity of $T(A)$ in the base category $\mathbb{X}$ (but not
in the Kleisli category).

To define the difference combinator for the Kleisli category, first note that
difference combinators by definition do not change the codomain. That is, if
$f : A \to T(B)$ is a Kleisli arrow, then the type of its derivative qua Kleisli arrow
should be $A \times A \to T(B) \times T(B)$, which coincides with the type of its derivative
in $\mathbb{X}$. Therefore, the difference combinator $\partial^T$ for the Kleisli category can be
defined to be the difference combinator of the base category, that is, for a Kleisli
map $f = \langle f_0, f_1 \rangle$:

$$\partial^T[f] = \partial[f] = \langle \partial[f_0], \partial[f_1] \rangle$$


Proposition 6. For a Cartesian difference category $\mathbb{X}$, the Kleisli category $\mathbb{X}_T$
is a Cartesian difference category with infinitesimal extension $\varepsilon^T$ and difference
combinator $\partial^T$.


Proof. The full lengthy brute force calculations will appear in an upcoming ex-
tended journal version of this paper. We do note that a crucial identity for this
proof is that for any map $f$ in $\mathbb{X}$, the following equality holds:


$$T(\partial[f]) = \partial[T(f)] \circ \langle \pi_0 \times \pi_0, \pi_1 \times \pi_1 \rangle$$


This helps simplify many of the calculations for the difference combinator axioms
since $T(\partial[f])$ appears everywhere due to the definition of Kleisli composition. □


As a result, the Kleisli category of a Cartesian difference category is again a
Cartesian difference category, whose infinitesimal extension is neither the iden-
tity nor the zero map. This allows one to build numerous examples of interesting
and exotic Cartesian difference categories, such as the Kleisli category of Carte-
sian differential categories (or iterating this process, taking the Kleisli category
of the Kleisli category). We highlight the importance of this construction in the
Cartesian differential case as it does not in general result in a Cartesian differ-
ential category. Indeed, even if $\varepsilon = 0$, it is always the case that $\varepsilon^T \neq 0$. We
conclude this section by taking a look at the linear maps and the $\varepsilon^T$-linear maps
in the Kleisli category. A Kleisli map $f = \langle f_0, f_1 \rangle$ is linear in the Kleisli category
if $\partial^T[f] = f \circ^T \pi_1^T$, which amounts to requiring that:


$$\langle \partial[f_0], \partial[f_1] \rangle = \langle f_0 \circ \pi_1, f_1 \circ \pi_1 \rangle$$


Therefore a Kleisli map is linear in the Kleisli category if and only if it is the
pairing of maps which are linear in the base category. On the other hand, $f$ is
$\varepsilon^T$-linear if $\varepsilon^T(f) = \langle 0, f_0 + \varepsilon(f_1) \rangle$ is linear in the Kleisli category, which in this
case amounts to requiring that $f_0 + \varepsilon(f_1)$ is linear. Therefore, if $f_0$ is linear and
$f_1$ is $\varepsilon$-linear, then $f$ is $\varepsilon^T$-linear.


7 Conclusions and Future Work


We have presented Cartesian difference categories, which generalize Cartesian 
differential categories to account for more discrete definitions of derivatives while 
providing an additional structure that is absent in change action models. We have 
also exhibited important examples and shown that Cartesian difference cate- 
gories arise quite naturally from considering tangent bundles in any Cartesian 
differential category. We claim that Cartesian difference categories can facilitate 
the exploration of differentiation in discrete spaces, by generalizing techniques 
and ideas from the study of their differential counterparts. For example, Carte- 
sian differential categories can be extended to allow objects whose tangent space 
is not necessarily isomorphic to the object itself [9]. The same generalization 
could be applied to Cartesian difference categories, with some caveats: for example,
the equation defining a linear map (Definition 10) becomes ill-typed, but
the notion of $\varepsilon$-linear map remains meaningful.

Another relevant path to consider is developing the analogue of the “tensor” 
story for Cartesian difference categories. Indeed, an important source of exam- 
ples of Cartesian differential categories are the coKleisli categories of a tensor 
differential category [3,4]. A similar result likely holds for a hypothetical “tensor
difference category”, but it is not clear how these should be defined: [C∂.2]
implies that derivatives in the difference sense are non-linear and therefore their
interplay with the tensor structure will be much different. 

A further generalization of Cartesian differential categories, categories with
tangent structure [7], are defined directly in terms of a tangent bundle functor
rather than requiring that every tangent bundle be trivial (that is, in a tangent
category it may not be the case that $TA = A \times A$). Some preliminary research
on change actions has already shown that, when generalized in this way, change 
actions are precisely internal categories, but the consequences of this for change 
action models (and, a fortiori, Cartesian difference categories) are not under- 
stood. More recently, some work has emerged about differential equations using 
the language of tangent categories [8]. We believe similar techniques can be ap- 
plied in a straightforward way to Cartesian difference categories, where they 
might be of use to give an abstract formalization of discrete dynamical systems 
and difference equations. 

An important open question is whether Cartesian difference categories (or a 
similar notion) admit an internal language. It is well-known that the differential
λ-calculus can be interpreted in Cartesian closed differential categories [14].
Given their similarities, we believe there will be a very similar “difference
λ-calculus” which could potentially have applications to automatic differentiation
(change structures, a notion similar to change actions, have already been proposed
as models of forward-mode automatic differentiation [12], although work
on the area seems to have stagnated).

Lastly, we should mention that there are adjunctions between the categories 
of Cartesian difference categories, change action models, and Cartesian differential
categories given by Propositions 1, 2, 3, and 4. These adjunctions will be
explored in detail in the upcoming journal version of this paper. 


Cartesian Difference Categories 75 


References 


1. Alvarez-Picallo, M., Eyers-Taylor, A., Jones, M.P., Ong, C.H.L.: Fixing incremental computation. In: European Symposium on Programming. pp. 525-552. Springer (2019)

2. Alvarez-Picallo, M., Ong, C.H.L.: Change actions: models of generalised differentiation. In: International Conference on Foundations of Software Science and Computation Structures. pp. 45-61. Springer (2019)

3. Blute, R.F., Cockett, J.R.B., Seely, R.A.G.: Differential categories. Mathematical Structures in Computer Science 16(06), 1049-1083 (2006)

4. Blute, R.F., Cockett, J.R.B., Seely, R.A.G.: Cartesian differential categories. Theory and Applications of Categories 22(23), 622-672 (2009)

5. Bradet-Legris, J., Reid, H.: Differential forms in non-linear cartesian differential categories (2018), Foundational Methods in Computer Science

6. Cai, Y., Giarrusso, P.G., Rendel, T., Ostermann, K.: A theory of changes for higher-order languages: Incrementalizing λ-calculi by static differentiation. In: ACM SIGPLAN Notices. vol. 49, pp. 145-155. ACM (2014)

7. Cockett, J.R.B., Cruttwell, G.S.H.: Differential structure, tangent structure, and SDG. Applied Categorical Structures 22(2), 331-417 (2014)

8. Cockett, J., Cruttwell, G.: Connections in tangent categories. Theory and Applications of Categories 32(26), 835-888 (2017)

9. Cruttwell, G.S.: Cartesian differential categories revisited. Mathematical Structures in Computer Science 27(1), 70-91 (2017)

10. Ehrhard, T., Regnier, L.: The differential lambda-calculus. Theoretical Computer Science 309(1), 1-41 (2003)

11. Ehrhard, T.: An introduction to differential linear logic: proof-nets, models and antiderivatives. Mathematical Structures in Computer Science 28(7), 995-1060 (2018)

12. Kelly, R., Pearlmutter, B.A., Siskind, J.M.: Evolving the incremental λ calculus into a model of forward automatic differentiation (AD). arXiv preprint arXiv:1611.03429 (2016)

13. Kock, A.: Synthetic differential geometry, vol. 333. Cambridge University Press (2006)

14. Manzonetto, G.: What is a categorical model of the differential and the resource λ-calculi? Mathematical Structures in Computer Science 22(3), 451-520 (2012)

15. Manzyuk, O.: Tangent bundles in differential lambda-categories. arXiv preprint arXiv:1202.0411 (2012)

16. Richardson, C.H.: An introduction to the calculus of finite differences. Van Nostrand (1954)

17. Sprunger, D., Jacobs, B.: The differential calculus of causal functions. arXiv preprint arXiv:1904.10611 (2019)

18. Sprunger, D., Katsumata, S.y.: Differentiable causal computations via delayed trace. In: 2019 34th Annual ACM/IEEE Symposium on Logic in Computer Science (LICS). pp. 1-12. IEEE (2019)

19. Steinbach, B., Posthoff, C.: Boolean differential calculus. In: Logic Functions and Equations, pp. 75-103. Springer (2009)


76 M. Alvarez-Picallo and J.-S. P. Lemay 


Open Access This chapter is licensed under the terms of the Creative Commons 
Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), 
which permits use, sharing, adaptation, distribution and reproduction in any medium 
or format, as long as you give appropriate credit to the original author(s) and the 
source, provide a link to the Creative Commons license and indicate if changes were 
made. 

The images or other third party material in this chapter are included in the chapter’s 
Creative Commons license, unless indicated otherwise in a credit line to the material. If 
material is not included in the chapter’s Creative Commons license and your intended 
use is not permitted by statutory regulation or exceeds the permitted use, you will need 
to obtain permission directly from the copyright holder. 




Contextual Equivalence for Signal Flow Graphs 


Filippo Bonchi¹, Robin Piedeleu²*, Paweł Sobociński³**, and
Fabio Zanasi²* (✉)


¹ Università di Pisa, Italy
² University College London, UK, {r.piedeleu, f.zanasi}@ucl.ac.uk
³ Tallinn University of Technology, Estonia


Abstract. We extend the signal flow calculus—a compositional account 
of the classical signal flow graph model of computation—to encompass 
affine behaviour, and furnish it with a novel operational semantics. The 
increased expressive power allows us to define a canonical notion of con- 
textual equivalence, which we show to coincide with denotational equal- 
ity. Finally, we characterise the realisable fragment of the calculus: those 
terms that express the computations of (affine) signal flow graphs. 


Keywords: signal flow graphs · affine relations · full abstraction · contextual
equivalence · string diagrams


1 Introduction 


Compositional accounts of models of computation often lead one to consider 
relational models because a decomposition of an input-output system might 
consist of internal parts where flow and causality are not always easy to assign. 
These insights led Willems [33] to introduce a new current of control theory,
called behavioural control: roughly speaking, behaviours and observations are of 
prime concern, notions such as state, inputs or outputs are secondary. Independently,
programming language theory converged on similar ideas, with contextual
equivalence [25,28] often considered as the equivalence: programs are judged to
be different if we can find some context in which one behaves differently from
the other, and what is observed about “behaviour” is often something quite
canonical and simple, such as termination. Hoare [17] and Milner [23] discovered
that these programming language theory innovations also bore fruit in the
non-deterministic context of concurrency. Here again, research converged on studying
simple and canonical contextual equivalences [24,18].

This paper brings together all of the above threads. The model of computation
of interest for us is that of signal flow graphs [32,21], which are feedback
systems well known in control theory [21] and widely used in the modelling of
linear dynamical systems (in continuous time) and signal processing circuits (in
discrete time). The signal flow calculus [10,9] is a syntactic presentation with
an underlying compositional denotational semantics in terms of linear relations. 
Armed with string diagrams as a syntax, the tools and concepts of program- 
ming language theory and concurrency theory can be put to work and the cal- 
culus can be equipped with a structural operational semantics. However, while 
in previous work [9] a connection was made between operational equivalence 
(essentially trace equivalence) and denotational equality, the signal flow calculus 
was not quite expressive enough for contextual equivalence to be a useful notion. 

The crucial step turns out to be moving from linear relations to affine rela- 
tions, i.e. linear subspaces translated by a vector. In recent work [6], we showed 
that they can be used to study important physical phenomena, such as current 
and voltage sources in electrical engineering, as well as fundamental synchroni- 
sation primitives in concurrency, such as mutual exclusion. Here we show that, 
in addition to yielding compelling mathematical domains, affinity proves to be 
the magic ingredient that ties the different components of the story of signal flow 
graphs together: it provides us with a canonical and simple notion of observation 
to use for the definition of contextual equivalence, and gives us the expressive 
power to prove a bona fide full abstraction result that relates contextual equiv- 
alence with denotational equality. 

To obtain the above result, we extend the signal flow calculus to handle affine 
behaviour. While the denotational semantics and axiomatic theory appeared 
in [6], the operational account appears here for the first time and requires some 
technical innovations: instead of traces, we consider trajectories, which are infi- 
nite traces that may start in the past. To record the time, states of our transition 
system have a runtime environment that keeps track of the global clock. 

Because the affine signal flow calculus is oblivious to flow directionality, some 
terms exhibit pathological operational behaviour. We illustrate these phenomena 
with several examples. Nevertheless, for the linear sub-calculus, it is known [9] 
that every term is denotationally equal to an executable realisation: one that 
is in a form where a consistent flow can be identified, like the classical notion 
of signal flow graph. We show that the question has a more subtle answer in 
the affine extension: not all terms are realisable as (affine) signal flow graphs. 
However, we are able to characterise the class of diagrams for which this is true. 


Related work. Several authors studied signal flow graphs by exploiting concepts
and techniques of programming language semantics, see e.g. [4,22,29,2]. The most
relevant for this paper is [2], which, independently from [10], proposed the same
syntax and axiomatisation for the ordinary signal flow calculus and shares with
our contribution the same methodology: the use of string diagrams as a mathematical
playground for the compositional study of different sorts of systems.
The idea is common to diverse, cross-disciplinary research programmes, including
Categorical Quantum Mechanics [1,11,12], Categorical Network Theory [3],
Monoidal Computer and the analysis of (a)synchronous circuits [14,15].


Outline. In Section 2 we recall the affine signal flow calculus. Section 3 introduces
the operational semantics for the calculus. Section 4 defines contextual equivalence
and proves full abstraction. Section 5 introduces a well-behaved class of
circuits, that denotes functional input-output systems, laying the groundwork
for Section 6, in which the concept of realisability is introduced before a characterisation
of which circuit diagrams are realisable. Missing proofs can be found
in the extended version of this paper [7].


2 Background: the Affine Signal Flow Calculus 


The Affine Signal Flow Calculus extends the signal flow calculus [9] with an
extra generator that allows one to express affine relations. In this section, we
first recall its syntax and denotational semantics from [6] and then we highlight 
two key properties for proving full abstraction that are enabled by the affine 
extension. The operational semantics is delayed to the next section. 


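copier : (1,2)    discard : (1,0)    register : (1,1)    scalar k : (1,1)    adder : (2,1)    zero : (0,1)    affine generator : (0,1)

mirrored copier : (2,1)    mirrored discard : (0,1)    mirrored register : (1,1)    mirrored scalar k : (1,1)    mirrored adder : (1,2)    mirrored zero : (1,0)    mirrored affine generator : (1,0)

empty circuit : (0,0)    identity wire : (1,1)    twist : (2,2)

c : (n,z)   d : (z,m)   ⟹   c ; d : (n,m)          c : (n,m)   d : (r,z)   ⟹   c ⊕ d : (n+r, m+z)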


Fig. 1. Sort inference rules. 


2.1 Syntax 
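c ::= copier | discard | register | scalar k | adder | zero | affine generator    (1)
    | mirrored copier | mirrored discard | mirrored register | mirrored scalar k | mirrored adder | mirrored zero | mirrored affine generator    (2)
    | empty circuit | identity wire | twist | c ⊕ c | c ; c    (3)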


The syntax of the calculus, generated by the grammar above, is parametrised
over a given field $\mathsf{k}$, with $k$ ranging over $\mathsf{k}$. We refer to the constants in rows (1)-(2)
as generators. Terms are constructed from generators, the empty circuit, the
identity wire, the twist, and the
two binary operations in (3). We will only consider those terms that are sortable,
i.e. they can be associated with a pair $(n, m)$, with $n, m \in \mathbb{N}$. Sortable terms are
called circuits: intuitively, a circuit with sort $(n, m)$ has $n$ ports on the left and
$m$ on the right. The sorting discipline is given in Fig. 1. We delay discussion of
computational intuitions to Section 3 but, for the time being, we observe that
the generators of row (2) are those of row (1) “reflected about the y-axis”.
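
The sorting discipline can be sketched as a small recursive check. The following Python sketch is illustrative only: the class names `Gen`, `Seq` and `Par`, the generator labels (`copier`, `one`, and so on) and the function `sort_of` are assumptions introduced here, not notation from the calculus, and the generator sorts are read off the informal description of each generator (a copier has one port on the left and two on the right, and so on).

```python
from dataclasses import dataclass
from typing import Optional, Tuple, Union

Sort = Tuple[int, int]

# Assumed sorts of the left-to-right generators and structural connectors.
# The labels are hypothetical names chosen for this sketch.
GENERATOR_SORTS = {
    "copier": (1, 2),
    "discard": (1, 0),
    "register": (1, 1),
    "scalar": (1, 1),
    "adder": (2, 1),
    "zero": (0, 1),
    "one": (0, 1),       # the affine generator
    "empty": (0, 0),
    "identity": (1, 1),
    "twist": (2, 2),
}


@dataclass(frozen=True)
class Gen:
    name: str
    mirrored: bool = False  # True for the reflected generators of row (2)


@dataclass(frozen=True)
class Seq:  # sequential composition c ; d
    left: "Term"
    right: "Term"


@dataclass(frozen=True)
class Par:  # parallel composition c (+) d
    top: "Term"
    bottom: "Term"


Term = Union[Gen, Seq, Par]


def sort_of(term: Term) -> Optional[Sort]:
    """Return the sort (n, m) of a term, or None if the term is not sortable."""
    if isinstance(term, Gen):
        n, m = GENERATOR_SORTS[term.name]
        # Reflecting a generator swaps its left and right ports.
        return (m, n) if term.mirrored else (n, m)
    if isinstance(term, Seq):
        a, b = sort_of(term.left), sort_of(term.right)
        # c ; d is sortable only when the shared interface matches.
        if a is None or b is None or a[1] != b[0]:
            return None
        return (a[0], b[1])
    if isinstance(term, Par):
        a, b = sort_of(term.top), sort_of(term.bottom)
        if a is None or b is None:
            return None
        return (a[0] + b[0], a[1] + b[1])
    raise TypeError(f"not a term: {term!r}")
```

For instance, `sort_of(Seq(Gen("copier"), Gen("adder")))` yields `(1, 1)`, whereas composing two copiers sequentially is rejected because the interfaces (2 and 1) do not match.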


2.2 String Diagrams 
It is convenient to consider circuits as the arrows of a symmetric monoidal category
ACirc (for Affine Circuits). Objects of ACirc are natural numbers (thus
ACirc is a prop [19]) and morphisms $n \to m$ are the circuits of sort $(n, m)$,
quotiented by the laws of symmetric monoidal categories.⁴ The circuit
grammar yields the symmetric monoidal structure of ACirc: sequential composition
is given by $c ; d$, the monoidal product is given by $c \oplus d$, and identities and
symmetries are built by pasting together identity wires and twists in the obvious way. We will
adopt the usual convention of writing morphisms of ACirc as string diagrams,
meaning that $c ; c'$ is drawn with $c$ to the left of $c'$ and their shared ports wired
together, and $c \oplus d$ is drawn with $c$ stacked above $d$. More succinctly,
ACirc is the free prop on generators (1)-(2). The free prop on (1)-(2) sans
the affine generator and its mirror image, hereafter called Circ, is the signal flow calculus from [9].


Example 1. The diagram represents the circuit 


((e—; —€_)®—) ; (—8(>—; —_)) ; (8 4®-)®— ) ; (> ; -#)4 


2.3 Denotational Semantics and Axiomatisation 
The semantics of circuits can be given denotationally by means of affine relations. 


Definition 1. Let $\mathsf{k}$ be a field. An affine subspace of $\mathsf{k}^d$ is a subset $V \subseteq \mathsf{k}^d$ that
is either empty or for which there exists a vector $a \in \mathsf{k}^d$ and a linear subspace
$L$ of $\mathsf{k}^d$ such that $V = \{a + v \mid v \in L\}$. A $\mathsf{k}$-affine relation of type $n \to m$ is an
affine subspace of $\mathsf{k}^n \times \mathsf{k}^m$, considered as a $\mathsf{k}$-vector space.


Note that every linear subspace is affine, taking a above to be the zero vector. 
Affine relations can be organised into a prop: 


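Definition 2. Let $\mathsf{k}$ be a field. Let $\mathsf{ARel}_{\mathsf{k}}$ be the following prop:


– arrows $n \to m$ are $\mathsf{k}$-affine relations.

– composition is relational: given $G = \{(u, v) \mid u \in \mathsf{k}^n, v \in \mathsf{k}^m\}$ and $H = \{(v, w) \mid v \in \mathsf{k}^m, w \in \mathsf{k}^l\}$, their composition is $G ; H := \{(u, w) \mid \exists v.\ (u, v) \in G \wedge (v, w) \in H\}$.

– monoidal product given by $G \oplus H = \left\{ \left( \begin{pmatrix} u \\ u' \end{pmatrix}, \begin{pmatrix} v \\ v' \end{pmatrix} \right) \mid (u, v) \in G, (u', v') \in H \right\}$.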
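
Definition 2 can be tried out on a toy instance. The sketch below is an illustration under stated assumptions, not the semantics used in this paper: it works over the two-element field GF(2) so that relations are finite sets, it represents vectors as Python tuples (the empty tuple standing for the unique vector of length 0), and the names `compose`, `monoidal`, `copier`, `adder`, `one` and `cozero` are hypothetical.

```python
# Relations are sets of pairs (u, v), where u and v are tuples over GF(2) = {0, 1}.
F2 = (0, 1)


def compose(G, H):
    """Relational composition: G ; H = {(u, w) | exists v. (u, v) in G and (v, w) in H}."""
    return {(u, w) for (u, v) in G for (v2, w) in H if v == v2}


def monoidal(G, H):
    """Monoidal product: stack the left vectors together and the right vectors together."""
    return {(u + u2, v + v2) for (u, v) in G for (u2, v2) in H}


# A few example relations (assumed names), each an affine subspace over GF(2).
copier = {((p,), (p, p)) for p in F2}                   # type 1 -> 2
adder = {((p, q), ((p + q) % 2,)) for p in F2 for q in F2}  # type 2 -> 1
one = {((), (1,))}                                      # type 0 -> 1, affine but not linear
cozero = {((0,), ())}                                   # type 1 -> 0

# Copying a value and adding it to itself always gives 0 in GF(2).
doubled = compose(copier, adder)

# Composing the constant 1 with a relation that only accepts 0 gives the empty relation.
clash = compose(one, cozero)
```

Here `doubled` relates every input to 0, and `clash` is the empty set, which is an affine relation of type 0 → 0 that no linear relation can express.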


In order to give semantics to ACirc, we use the prop of affine relations over
the field $\mathsf{k}(x)$ of fractions of polynomials in $x$ with coefficients from $\mathsf{k}$. Elements
$q \in \mathsf{k}(x)$ are fractions $\frac{k_0 + k_1 x + k_2 x^2 + \dots + k_n x^n}{l_0 + l_1 x + l_2 x^2 + \dots + l_m x^m}$ for some $n, m \in \mathbb{N}$ and $k_i, l_i \in \mathsf{k}$.
Sum, product, 0 and 1 in $\mathsf{k}(x)$ are defined as usual.


4 This quotient is harmless: both the denotational semantics from [6] and the opera- 
tional semantics we introduce in this paper satisfy those axioms on the nose. 
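Definition 3. The prop morphism $[\![\cdot]\!] : \mathsf{ACirc} \to \mathsf{ARel}_{\mathsf{k}(x)}$ is inductively defined
on circuits as follows. For the generators in (1):


copier $\mapsto \left\{ \left( p, \begin{pmatrix} p \\ p \end{pmatrix} \right) \mid p \in \mathsf{k}(x) \right\}$      adder $\mapsto \left\{ \left( \begin{pmatrix} p \\ q \end{pmatrix}, p + q \right) \mid p, q \in \mathsf{k}(x) \right\}$

discard $\mapsto \{ (p, \bullet) \mid p \in \mathsf{k}(x) \}$      zero $\mapsto \{ (\bullet, 0) \}$      affine generator $\mapsto \{ (\bullet, 1) \}$

scalar $k$ $\mapsto \{ (p, p \cdot k) \mid p \in \mathsf{k}(x) \}$      register $\mapsto \{ (p, p \cdot x) \mid p \in \mathsf{k}(x) \}$


where $\bullet$ is the only element of $\mathsf{k}(x)^0$. The semantics of components in (2) is
symmetric, e.g. the mirrored discard is mapped to $\{ (\bullet, p) \mid p \in \mathsf{k}(x) \}$. For (3):


identity wire $\mapsto \{ (p, p) \mid p \in \mathsf{k}(x) \}$      twist $\mapsto \left\{ \left( \begin{pmatrix} p \\ q \end{pmatrix}, \begin{pmatrix} q \\ p \end{pmatrix} \right) \mid p, q \in \mathsf{k}(x) \right\}$

empty circuit $\mapsto \{ (\bullet, \bullet) \}$      $c_1 \oplus c_2 \mapsto [\![c_1]\!] \oplus [\![c_2]\!]$      $c_1 ; c_2 \mapsto [\![c_1]\!] ; [\![c_2]\!]$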

The reader can easily check that the pair of 1-dimensional vectors $\left(1, \frac{1}{1-x}\right) \in \mathsf{k}(x)^1 \times \mathsf{k}(x)^1$
belongs to the denotation of the circuit in Example 1.

The denotational semantics enjoys a sound and complete axiomatisation.
The axioms involve only basic interactions between the generators (1)-(2). The
resulting theory is that of Affine Interacting Hopf Algebras (aIH). The generators
in (1) form a Hopf algebra, those in (2) form another Hopf algebra, and the
interaction of the two gives rise to two Frobenius algebras. We refer the reader
to [6] for the full set of equations and all further details.


Proposition 1. For all $c, d$ in ACirc, $[\![c]\!] = [\![d]\!]$ if and only if $c \stackrel{\mathrm{aIH}}{=} d$.


2.4 Affine vs Linear Circuits 


It is important to highlight the differences between ACirc and Circ. The latter 
is the purely linear fragment: circuit diagrams of Circ denote exactly the linear 
relations over k(x) [8], while those of ACirc denote the affine relations over k(x). 

The additional expressivity afforded by affine circuits is essential for our 
development. One crucial property is that every polynomial fraction can be 
expressed as an affine circuit of sort (0, 1). 


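Lemma 1. For all $p \in \mathsf{k}(x)$, there is $c_p \in \mathsf{ACirc}[0, 1]$ with $[\![c_p]\!] = \{(\bullet, p)\}$.


Proof. For each $p \in \mathsf{k}(x)$, let $P$ be the linear subspace generated by the pair of
1-dimensional vectors $(1, p)$, that is, $P = \{(q, q \cdot p) \mid q \in \mathsf{k}(x)\}$. By fullness of the denotational semantics of Circ [8],
there exists a circuit $c$ in Circ such that $[\![c]\!] = P$. Writing $\vdash$ for the affine generator, whose denotation is $\{(\bullet, 1)\}$, we obtain $[\![\vdash ; c]\!] = \{(\bullet, p)\}$.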


The above observation yields the following: 


Proposition 2. Let $(u, v) \in \mathsf{k}(x)^n \times \mathsf{k}(x)^m$. There exist circuits $c_u \in \mathsf{ACirc}[0, n]$
and $c_v \in \mathsf{ACirc}[m, 0]$ such that $[\![c_u]\!] = \{(\bullet, u)\}$ and $[\![c_v]\!] = \{(v, \bullet)\}$.


Proof. Let $u = (p_1, \dots, p_n)^T$ and $v = (q_1, \dots, q_m)^T$. By Lemma 1, for each $p_i$, there exists a
circuit $c_{p_i}$ such that $[\![c_{p_i}]\!] = \{(\bullet, p_i)\}$. Let $c_u = c_{p_1} \oplus \dots \oplus c_{p_n}$. Then $[\![c_u]\!] = \{(\bullet, u)\}$.
For $c_v$, it is enough to see that Lemma 1 also holds with 0 and 1
switched, then use the argument above.

Proposition 2 asserts that any behaviour $(u, v)$ occurring in the denotation of
some circuit $c$, i.e., such that $(u, v) \in [\![c]\!]$, can be expressed by a pair of circuits
$(c_u, c_v)$. We will, in due course, think of such a pair as a context, namely an
environment with which a circuit can interact. Observe that this is not possible
with the linear fragment Circ, since the only singleton linear subspace is 0.

Another difference between linear and affine concerns circuits of sort (0, 0).
Indeed $\mathsf{k}(x)^0 = \{\bullet\}$, and the only linear relation over $\mathsf{k}(x)^0 \times \mathsf{k}(x)^0$ is the singleton
$\{(\bullet, \bullet)\}$, which is $id_0$ in $\mathsf{ARel}_{\mathsf{k}(x)}$. But there is another affine relation, namely the
empty relation $\emptyset \subseteq \mathsf{k}(x)^0 \times \mathsf{k}(x)^0$. This can be represented, for instance, by the
affine generator composed with the mirror image of the zero generator,
since its denotation is $\{(\bullet, 1)\} ; \{(0, \bullet)\} = \emptyset$.


Proposition 3. Let $c \in \mathsf{ACirc}[0, 0]$. Then $[\![c]\!]$ is either $id_0$ or $\emptyset$.


3 Operational Semantics for Affine Circuits 


Here we give the structural operational semantics of affine circuits, building on 
previous work [9] that considered only the core linear fragment, Circ. We consider 
circuits to be programs that have an observable behaviour. Observations are 
possible interactions at the circuit’s interface. Since there are two interfaces: a 
left and a right, each transition has two labels. 


In a transition $t \triangleright c \xrightarrow[w]{v} t' \triangleright c'$, $c$ and $c'$ are states, that is, circuits
augmented with information about which values $k \in \mathsf{k}$ are stored in each register
(left-to-right or mirrored) at that instant of the computation. When transitioning
to $c'$, the $v$ above the arrow is a vector of values with which $c$ synchronises on the
left, and the $w$ below the arrow accounts for the synchronisation on the right.
States are decorated with runtime contexts: $t$ and $t'$ are (possibly negative) integers
that, intuitively, indicate the time when the transition happens. Indeed,
in Fig. 2 every rule advances time by 1 unit. “Negative time” is important: as
we shall see in Example 3, some executions must start in the past.

The rules in the top section of Fig. 2 provide the semantics for the generators
in (1): the copier duplicates the signal arriving on the left; the discard accepts
any signal on the left and discards it, producing nothing on the right; the
adder takes two signals on the left and emits their sum on the right;
the zero emits the constant 0 signal on the right; the amplifier multiplies
the signal on the left by the scalar $k \in \mathsf{k}$. All the generators described so far
are stateless. State is provided by the register, a synchronous one
place buffer with the value $l$ stored. When it receives some value $k$ on the left,
it emits $l$ on the right and stores $k$. The behaviour of the affine generator
Fig. 2. Structural rules for operational semantics, with $p \in \mathbb{Z}$, $k, l$ ranging over $\mathsf{k}$ and
$u, v, w$ vectors of elements of $\mathsf{k}$ of the appropriate size. The only vector of $\mathsf{k}^0$ is written
as $\bullet$ (as in Definition 3), while a vector $(k_1 \dots k_n)^T \in \mathsf{k}^n$ as $k_1 \dots k_n$.


depends on the time: when t = 0, it emits 1, otherwise it emits 0. Observe that 
the behaviour of all other generators is time-independent. 
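
Read from left to right, the register and the affine generator behave like tiny stream transformers. The following Python sketch illustrates this reading under assumptions: the function names are hypothetical, signal values are plain integers, and only these two generators are modelled (the relational rules of the calculus are more general than this functional view).

```python
def step_register(stored, received):
    """One step of a register: emit the stored value, then store the received one.

    Returns (new_stored, emitted).
    """
    return received, stored


def step_affine(t):
    """The affine generator emits 1 at time 0 and 0 at every other time."""
    return 1 if t == 0 else 0


def run_affine_into_register(start, steps):
    """Feed the affine generator into a register that initially stores 0.

    Returns the list of (time, value emitted by the register).
    """
    stored, outputs = 0, []
    for t in range(start, start + steps):
        stored, emitted = step_register(stored, step_affine(t))
        outputs.append((t, emitted))
    return outputs


# The impulse produced at time 0 leaves the register one step later, at time 1.
delayed = run_affine_into_register(-1, 4)
```

Running the sketch from time -1 for four steps shows the register emitting 0, 0, 1, 0 at times -1, 0, 1, 2: the one-place buffer delays the time-dependent impulse by one tick.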


So far, we described the behaviour of the components in (1) using the intuition
that signal flows from left to right: in a transition $\xrightarrow[w]{v}$, the signal $v$ on
the left is thought of as trigger and $w$ as effect. For the generators in (2), whose
behaviour is defined by the rules in the second section of Fig. 2, the behaviour
is symmetric; indeed, here it is helpful to think of signals as flowing from right
to left. The next section of Fig. 2 specifies the behaviours of the structural connectors
of (3): the twist, which swaps two signals, the empty circuit,
and the identity wire, for which the signals on the left and on the right ports are
equal. Finally, the rule for sequential composition ($;$) forces the two components to
have the same value $v$ on the shared interface, while for parallel composition ($\oplus$),
components can proceed independently. Observe that both forms of composition
require component transitions to happen at the same time.


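Definition 4. Let $c \in \mathsf{ACirc}$. The initial state $c_0$ of $c$ is the one where all the
registers store 0. A computation of $c$ starting at time $t \le 0$ is a (possibly infinite)
sequence of transitions


$$t \triangleright c_0 \xrightarrow[w_t]{v_t} t+1 \triangleright c_1 \xrightarrow[w_{t+1}]{v_{t+1}} t+2 \triangleright c_2 \xrightarrow[w_{t+2}]{v_{t+2}} \cdots \qquad (4)$$


Since all transitions increment the time by 1, it suffices to record the time at
which a computation starts. As a result, to simplify notation, we will omit the
runtime context after the first transition and, instead of (4), write


$$t \triangleright c_0 \xrightarrow[w_t]{v_t} c_1 \xrightarrow[w_{t+1}]{v_{t+1}} c_2 \xrightarrow[w_{t+2}]{v_{t+2}} \cdots$$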

Example 2. The circuit in Example 1 can perform the following computation.


In the example above, the flow has a clear left-to-right orientation, albeit 
with a feedback loop. For arbitrary circuits of ACirc this is not always the case, 
which sometimes results in unexpected operational behaviour. 


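Example 3. In the circuit obtained by composing the affine generator with a mirrored register, it is not possible to identify a consistent flow: the affine generator goes from
left to right, while the mirrored register goes from right to left. Observe that there is no computation
starting at $t = 0$, since in the initial state the register contains 0 while the affine generator must
emit 1. There is, however, a (unique!) computation starting at time $t = -1$, that
loads the register with 1 before the affine generator can also emit 1 at time $t = 0$. Showing each state by its register content:


$$-1 \triangleright (0) \xrightarrow[1]{\bullet} (1) \xrightarrow[0]{\bullet} (0) \xrightarrow[0]{\bullet} (0) \xrightarrow[0]{\bullet} \cdots$$


Similarly, the affine generator followed by two mirrored registers features a unique computation starting at time $t = -2$. Showing each state by its pair of register contents, from left to right:


$$-2 \triangleright (0, 0) \xrightarrow[1]{\bullet} (0, 1) \xrightarrow[0]{\bullet} (1, 0) \xrightarrow[0]{\bullet} (0, 0) \xrightarrow[0]{\bullet} \cdots$$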

It is worthwhile clarifying the reason why, in the affine calculus, some computations
start in the past. As we have already mentioned, in the linear fragment
the semantics of all generators is time-independent. It follows easily that
time-independence is a property enjoyed by all purely linear circuits. The behaviour
of the affine generator, however, enforces a particular action to occur at time 0. Composing
it with a right-to-left register yields the first circuit of Example 3, and the effect is to
anticipate that action by one step to time $-1$. It is obvious
that this construction can be iterated, and it follows that the presence of a
single time-dependent generator results in a calculus in which the computation
of some terms must start at a finite, but unbounded time in the past.
Example 4. Another circuit with conflicting flow is the affine generator composed with the mirrored zero. Here there is no possible
transition at $t = 0$, since at that time the affine generator must emit a 1 and the mirrored zero can only synchronise
on a 0. Instead, the empty circuit can always perform an infinite computation
$t \triangleright \text{empty} \xrightarrow[\bullet]{\bullet} \text{empty} \xrightarrow[\bullet]{\bullet} \cdots$, for any $t \le 0$. Roughly speaking, the computations of
these two (0, 0) circuits are operational mirror images of the two possible denotations
of Proposition 3. This intuition will be made formal in Section 4. For now,
it is worth observing that, for all $c$, the empty circuit in parallel with $c$ can perform the same computations
as $c$, while the first circuit in parallel with $c$ cannot ever make a transition at time 0.


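Example 5. Consider the circuit formed by a mirrored register followed by a register, which again features conflicting flow.
Our equational theory equates it with the identity wire, but the computations involved are
subtly different. Indeed, for any sequence $a_i \in \mathsf{k}$, it is obvious that the identity wire $id$ admits
the computation


$$0 \triangleright id \xrightarrow[a_0]{a_0} id \xrightarrow[a_1]{a_1} id \xrightarrow[a_2]{a_2} \cdots \qquad (5)$$

The circuit admits a similar computation, but we must begin at time
$t = -1$ in order to first “load” the registers with $a_0$. Showing each state by its pair of register contents:

$$-1 \triangleright (0, 0) \xrightarrow[0]{0} (a_0, a_0) \xrightarrow[a_0]{a_0} (a_1, a_1) \xrightarrow[a_1]{a_1} (a_2, a_2) \xrightarrow[a_2]{a_2} \cdots \qquad (6)$$


The circuit formed by a register followed by a mirrored register, which again is equated with the identity wire by the equational theory,
is more tricky. Although every computation of the identity wire can be reproduced, this circuit
admits additional, problematic computations. Indeed, consider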


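$$0 \triangleright \underbrace{(0, 0)}_{\text{register contents}} \xrightarrow[1]{0} \underbrace{(0, 1)}_{\text{register contents}} \qquad (7)$$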


at which point no further transition is possible—the circuit can deadlock. 


The following lemma is an easy consequence of the rules of Fig. 2 and follows
by structural induction. It states that all circuits can stay idle in the past.


Lemma 2. Let $c \in \mathsf{ACirc}[n, m]$ with initial state $c_0$. Then $t \triangleright c_0 \xrightarrow[0]{0} t+1 \triangleright c_0$ if $t < 0$.


3.1 Trajectories 


For the non-affine version of the signal flow calculus, we studied in [9] traces 
arising from computations. For the affine extension, this is not possible since, as 
explained above, we must also consider computations that start in the past. In 
this paper, rather than traces we adopt a common control theoretic notion. 


Definition 5. An $(n, m)$-trajectory $\sigma$ is a $\mathbb{Z}$-indexed sequence $\sigma : \mathbb{Z} \to \mathsf{k}^n \times \mathsf{k}^m$
that is finite in the past, i.e., for which $\exists j \in \mathbb{Z}$ such that $\sigma(i) = (0, 0)$ for $i < j$.


By the universal property of the product we can identify $\sigma : \mathbb{Z} \to \mathsf{k}^n \times \mathsf{k}^m$
with the pairing $\langle \sigma_l, \sigma_r \rangle$ of $\sigma_l : \mathbb{Z} \to \mathsf{k}^n$ and $\sigma_r : \mathbb{Z} \to \mathsf{k}^m$. A $(k, m)$-trajectory
$\sigma$ and $(m, n)$-trajectory $\tau$ are compatible if $\sigma_r = \tau_l$. In this case, we can define
their composite, a $(k, n)$-trajectory $\sigma ; \tau$, by $\sigma ; \tau := \langle \sigma_l, \tau_r \rangle$. Given an $(n_1, m_1)$-trajectory
$\sigma_1$ and an $(n_2, m_2)$-trajectory $\sigma_2$, their product, an $(n_1 + n_2, m_1 + m_2)$-trajectory
$\sigma_1 \oplus \sigma_2$, is defined by $(\sigma_1 \oplus \sigma_2)(i) := \begin{pmatrix} \sigma_1(i) \\ \sigma_2(i) \end{pmatrix}$. Using these two operations
we can organise sets of trajectories into a prop.


Definition 6. The composition of two sets of trajectories is defined as $S ; T := \{\sigma ; \tau \mid \sigma \in S, \tau \in T \text{ are compatible}\}$. The product of sets of trajectories is defined
as $S_1 \oplus S_2 := \{\sigma_1 \oplus \sigma_2 \mid \sigma_1 \in S_1, \sigma_2 \in S_2\}$.
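
The two operations on trajectories, and their lifting to sets, can be sketched in Python. This is an illustration under assumptions rather than a faithful model: trajectories are restricted to those with finitely many non-zero entries (stronger than being finite in the past), values are integers, and the names `Trajectory`, `compatible`, `compose`, `tensor` and `compose_sets` are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict, Tuple

Vec = Tuple[int, ...]


@dataclass
class Trajectory:
    n: int  # size of the left vectors
    m: int  # size of the right vectors
    # time -> (left vector, right vector); times not listed map to (0, 0)
    points: Dict[int, Tuple[Vec, Vec]] = field(default_factory=dict)

    def at(self, i: int) -> Tuple[Vec, Vec]:
        zero = ((0,) * self.n, (0,) * self.m)
        return self.points.get(i, zero)


def _times(s: Trajectory, t: Trajectory):
    return set(s.points) | set(t.points)


def compatible(s: Trajectory, t: Trajectory) -> bool:
    """s and t are compatible when the right part of s equals the left part of t."""
    return s.m == t.n and all(s.at(i)[1] == t.at(i)[0] for i in _times(s, t))


def compose(s: Trajectory, t: Trajectory) -> Trajectory:
    """Composite s ; t: keep the left part of s and the right part of t."""
    if not compatible(s, t):
        raise ValueError("trajectories are not compatible")
    return Trajectory(s.n, t.m, {i: (s.at(i)[0], t.at(i)[1]) for i in _times(s, t)})


def tensor(s: Trajectory, t: Trajectory) -> Trajectory:
    """Product: stack left vectors together and right vectors together, pointwise."""
    pts = {i: (s.at(i)[0] + t.at(i)[0], s.at(i)[1] + t.at(i)[1]) for i in _times(s, t)}
    return Trajectory(s.n + t.n, s.m + t.m, pts)


def compose_sets(S, T):
    """Lift composition to collections of trajectories, keeping compatible pairs only."""
    return [compose(s, t) for s in S for t in T if compatible(s, t)]
```

For example, with `s = Trajectory(1, 1, {0: ((1,), (2,))})` and `t = Trajectory(1, 1, {0: ((2,), (5,))})`, the composite `compose(s, t)` maps time 0 to `((1,), (5,))` and every other time to zero vectors.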


Clearly both operations are strictly associative. The unit for $\oplus$ is the singleton
with the unique (0,0)-trajectory. Also $;$ has a two sided identity, given by sets
of “copycat” $(n, n)$-trajectories. Indeed, we have that:


Proposition 4. Sets of $(n, m)$-trajectories are the arrows $n \to m$ of a prop Traj
with composition and monoidal product given as in Definition 6.


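Traj serves for us as the domain for operational semantics: given a circuit $c$
and an infinite computation


$$t \triangleright c_0 \xrightarrow[w_t]{v_t} c_1 \xrightarrow[w_{t+1}]{v_{t+1}} c_2 \xrightarrow[w_{t+2}]{v_{t+2}} \cdots$$


its associated trajectory $\sigma$ is


$$\sigma(i) = \begin{cases} (v_i, w_i) & \text{if } i \ge t, \\ (0, 0) & \text{otherwise.} \end{cases}$$


Definition 7. For a circuit $c$, $\langle c \rangle$ is the set of trajectories given by its infinite
computations, following the translation above.


The assignment $c \mapsto \langle c \rangle$ is compositional, that is:
Theorem 1. $\langle \cdot \rangle : \mathsf{ACirc} \to \mathsf{Traj}$ is a morphism of props.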


Example 6. Consider the computations (5) and (6) from Example 5. According
to the translation above, both are translated into the trajectory $\sigma$ mapping $i \ge 0$ into $(a_i, a_i)$ and
$i < 0$ into $(0, 0)$. The reader can easily verify that, more generally, the identity wire
and the first circuit of Example 5 have the same set of trajectories. At this point it is worth remarking that the two circuits
would be distinguished when looking at their traces: the trace of computation (5)
is different from the trace of (6). Indeed, the full abstraction result in [9] does
not hold for all circuits, but only for those of a certain kind. The affine extension
obliges us to consider computations that start in the past and, in turn, this
drives us toward a stronger full abstraction result, shown in the next section.


Before concluding, it is important to emphasise that the identity wire and the second circuit of Example 5
also have the same set of trajectories. Indeed, problematic computations, like (7), are all finite and, by
definition, do not give rise to any trajectory. The reader should note that the use
of trajectories is not a semantic device to get rid of problematic computations.
In fact, trajectories do not appear in the statement of our full abstraction result;
they are merely a convenient tool to prove it. Another result (Proposition 5)
independently takes care of ruling out problematic computations.
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4 Contextual Equivalence and Full Abstraction 


This section contains the main contribution of the paper: a traditional full
abstraction result asserting that contextual equivalence agrees with denotational
equivalence. It is not a coincidence that we prove this result in the affine
setting: affinity plays a crucial role, both in its statement and proof. In particular,
Proposition 3 gives us two possibilities for the denotation of (0, 0) circuits: (i)
∅, which, roughly speaking, means that there is a problem (see e.g. Example 4)
and no infinite computation is possible, or (ii) id_0, in which case infinite
computations are possible. This provides us with a basic notion of observation, akin
to observing termination vs non-termination in the λ-calculus.


Definition 8. For a circuit c ∈ ACirc[0,0] we write c ↑ if c can perform an
infinite computation and c ↑̸ otherwise. For instance: [circuit diagram] ↑, while [circuit diagram] ↑̸.


To be able to make observations about arbitrary circuits we need to introduce
an appropriate notion of context. Roughly speaking, contexts for us are
(0, 0)-circuits with a hole into which we can plug another circuit. Since ours
is a variable-free presentation, “dangling wires” assume the role of free
variables [16]: restricting to (0, 0) contexts is therefore analogous to considering
ground contexts (i.e. contexts with no free variables), a standard concept of
programming language theory.

To define contexts formally, we extend the syntax of Section with an
extra generator “−” of sort (n, m). A (0, 0)-circuit of this extended syntax is a
context when “−” occurs exactly once. Given an (n, m)-circuit c and a context
C[−], we write C[c] for the circuit obtained by replacing the unique occurrence
of “−” by c.


With this setup, given an (n, m)-circuit c, we can insert it into a context
C[−] and observe the possible outcome: either C[c] ↑ or C[c] ↑̸. This naturally
leads us to contextual equivalence and the statement of our main result.


Definition 9. Given c, d ∈ ACirc[n, m], we say that they are contextually
equivalent, written c ≡ d, if for all contexts C[−],


C[c] ↑ iff C[d] ↑.


Example 7. Recall from Example 5 the circuits ─ and [circuit diagram]. Take the
context C[−] = c_σ ; − ; c_τ for c_σ ∈ ACirc[0, 1] and c_τ ∈ ACirc[1, 0]. Assume that
c_σ and c_τ have a single infinite computation. Call σ and τ the corresponding
trajectories. If σ = τ, both C[─] and C[[circuit diagram]] would be able to perform
an infinite computation. Instead if σ ≠ τ, neither of them would perform any
infinite computation: C[─] would stop at time t, for t the first moment such that
σ(t) ≠ τ(t), while C[[circuit diagram]] would stop at time t + 1.
Now take as context C[−] = •─ ; − ; ─•. In contrast to c_σ and c_τ, •─
and ─• can perform more than one single computation: at any time they can
nondeterministically emit any value. Thus every computation of C[─] = •─•
can always be extended to an infinite one, forcing synchronisation of •─ and
─• at each step. For C[[circuit diagram]], •─ and ─• may emit different
values at time t, but the computation will get stuck at t + 1. However, our
definition of ↑ only cares about whether C[[circuit diagram]] can perform an infinite
computation. Indeed it can, as long as •─ and ─• consistently emit the same
value at each time step.

If we think of contexts as tests, and say that a circuit c passes test C[−] if
C[c] can perform an infinite computation, then our notion of contextual equivalence
is may-testing equivalence [13]. From this perspective, ─ and [circuit diagram] are not
must equivalent, since the former must pass the test •─ ; − ; ─• while the latter
may not. It is worth remarking here that the distinction between may and must
testing will cease to make sense in Section 5, where we identify a certain class
of circuits equipped with a proper flow directionality and thus a deterministic,
input-output, behaviour.


Theorem 2 (Full abstraction). c ≡ d iff c = d in aIH.


The remainder of this section is devoted to the proof of Theorem 2. We
will start by clarifying the relationship between fractions of polynomials (the
denotational domain) and trajectories (the operational domain).


4.1 From Polynomial Fractions to Trajectories 


The missing link between polynomial fractions and trajectories is given by (formal)
Laurent series: we now recall this notion. Formally, a Laurent series is a function
σ: ℤ → k for which there exists j ∈ ℤ such that σ(i) = 0 for all i < j. We
write σ as …, σ(−1), σ(0), σ(1), … with position 0 underlined, or as formal sum
$\sum_{i=d}^{\infty} \sigma(i)x^i$. Each Laurent series σ has then a degree d ∈ ℤ, which is the
index of its first non-zero element. Laurent series form a field k((x)): sum is pointwise,
product is by convolution, and the inverse σ⁻¹ of σ with degree d is defined as:


$$\sigma^{-1}(i) = \begin{cases} 0 & \text{if } i < -d \\ \sigma(d)^{-1} & \text{if } i = -d \\ \dfrac{\sum_{j=1}^{n} \sigma(d+j) \cdot \sigma^{-1}(-d+n-j)}{-\sigma(d)} & \text{if } i = -d+n \text{ for } n > 0 \end{cases} \qquad (9)$$
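
The recursion defining the inverse can be unfolded mechanically. The following is a minimal sketch, assuming a truncated representation in which a Laurent series of degree d is given by the list of its coefficients from position d onwards, with exact arithmetic over the rationals; the function name `laurent_inverse` and this representation are illustrative choices, not part of the paper.

```python
from fractions import Fraction

def laurent_inverse(coeffs, d, terms):
    """First `terms` coefficients of the inverse of a Laurent series.

    coeffs[j] holds sigma(d + j); coeffs[0] must be non-zero, so that d is
    the degree. Coefficients beyond the end of the list are taken to be zero.
    Returns (degree_of_inverse, coefficients_of_inverse).
    """
    sigma = [Fraction(c) for c in coeffs]
    if not sigma or sigma[0] == 0:
        raise ValueError("coeffs[0] must be the first non-zero coefficient")
    inv = [1 / sigma[0]]  # sigma^{-1}(-d) = sigma(d)^{-1}
    for n in range(1, terms):
        # sigma^{-1}(-d+n) = (sum_{j=1..n} sigma(d+j) * sigma^{-1}(-d+n-j)) / (-sigma(d))
        acc = sum(sigma[j] * inv[n - j] for j in range(1, n + 1) if j < len(sigma))
        inv.append(acc / -sigma[0])
    return -d, inv

# Inverse of 1 - x (degree 0): the geometric series 1 + x + x^2 + ...
geometric = laurent_inverse([1, -1], 0, 4)
# Inverse of x (degree 1): the series 1/x, whose only non-zero value sits at position -1
one_over_x = laurent_inverse([1], 1, 3)
```

In this sketch the inverse of x has degree −1, so it is a Laurent series that is not a power series, whereas the inverse of 1 − x has degree 0.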


Note (formal) power series, which form ‘just’ a ring k[[x]], are a particular case of
Laurent series, namely those σs for which d ≥ 0. What is most interesting for our
purposes is how polynomials and fractions of polynomials relate to k((x)) and
k[[x]]. First, the ring k[x] of polynomials embeds into k[[x]], and thus into k((x)):
a polynomial p_0 + p_1 x + ··· + p_n x^n can also be regarded as the power series
$\sum_{i=0}^{\infty} p_i x^i$ with p_i = 0 for all i > n. Because Laurent series are closed under
division, this immediately gives also an embedding of the field of polynomial
fractions k(x) into k((x)). Note that the full expressiveness of k((x)) is required:
for instance, the fraction 1/x is represented as the Laurent series …, 0, 1, 0, 0, …,
which is not a power series, because a non-zero value appears before position 0.
In fact, fractions that are expressible as power series are precisely the rational
fractions, i.e. of the form


$$\frac{k_0 + k_1 x + k_2 x^2 + \cdots + k_n x^n}{l_0 + l_1 x + l_2 x^2 + \cdots + l_n x^n} \quad \text{where } l_0 \neq 0.$$


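Rational fractions form a ring k⟨x⟩ which, differently from the full field k(x),
embeds into k[[x]]. Indeed, whenever l_0 ≠ 0, the inverse of
l_0 + l_1 x + l_2 x^2 + ··· + l_n x^n is, by (9), a bona fide power series. The
commutative diagram below is a summary.

$$\begin{array}{ccccc} & & k[[x]] & \hookrightarrow & k((x)) \\ & \nearrow & \uparrow & & \uparrow \\ k[x] & \hookrightarrow & k\langle x\rangle & \hookrightarrow & k(x) \end{array}$$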


Relations between k((x))-vectors organise themselves into a prop ARel_{k((x))}
(see Definition 2). There is an evident prop morphism ι: ARel_{k(x)} → ARel_{k((x))}:
it maps the empty affine relation on k(x) to the one on k((x)), and otherwise
applies pointwise the embedding of k(x) into k((x)). For the next step, observe
that trajectories are in fact rearrangements of Laurent series: each pair of vectors
(u, v) ∈ k((x))^n × k((x))^m, as on the left below, yields the trajectory κ(u, v)
defined for all i ∈ ℤ as on the right below.


$$u = \begin{pmatrix} \alpha^1 \\ \vdots \\ \alpha^n \end{pmatrix}, \quad v = \begin{pmatrix} \beta^1 \\ \vdots \\ \beta^m \end{pmatrix} \qquad\qquad \kappa(u, v)(i) = \left( \begin{pmatrix} \alpha^1(i) \\ \vdots \\ \alpha^n(i) \end{pmatrix}, \begin{pmatrix} \beta^1(i) \\ \vdots \\ \beta^m(i) \end{pmatrix} \right)$$

Similarly to ι, the assignment κ extends to sets of vectors, and also to a prop
morphism from ARel_{k((x))} to Traj. Together, κ and ι provide the desired link
between operational and denotational semantics.
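
The rearrangement performed by κ can be pictured with a few lines of code. This is only a sketch under an assumed encoding: each Laurent series is a dictionary from positions to values, absent positions being read as 0, and a trajectory is a function from positions to pairs of tuples. The name `kappa` and the encoding are hypothetical.

```python
def kappa(u, v):
    """Rearrange a pair of vectors of Laurent series into a trajectory.

    u and v are lists of series; each series is a dict {position: value},
    and any position that is absent from the dict has value 0.
    """
    def trajectory(i):
        left = tuple(series.get(i, 0) for series in u)
        right = tuple(series.get(i, 0) for series in v)
        return (left, right)
    return trajectory

# One series on the left, two on the right (n = 1, m = 2)
u = [{0: 1, 1: 2}]
v = [{-1: 5}, {0: 7}]
sigma = kappa(u, v)
```

Here `sigma(i)` collects, for each time i, the i-th value of every series on the left and on the right, which is all that κ does.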


Theorem 3. ⟨·⟩ = κ ∘ ι ∘ ⟦·⟧


Proof. Since both are symmetric monoidal functors from a free prop, it is enough
to check the statement for the generators of ACirc. We show, as an example, the
case of the generator ─<. By definition, ⟦─<⟧ = {(p, (p, p)ᵀ) | p ∈ k(x)}. This is mapped
by ι to {(α, (α, α)ᵀ) | α ∈ k((x))}. Now, to see that κ(ι(⟦─<⟧)) = ⟨─<⟩, it is
enough to observe that a trajectory σ is in κ(ι(⟦─<⟧)) precisely when, for all
i, there exists some k_i ∈ k such that σ(i) = (k_i, (k_i, k_i)ᵀ).


4.2 Proof of Full Abstraction 


We now have the ingredients to prove Theorem 2. First, we prove an adequacy
result for (0, 0) circuits.


Proposition 5. Let c ∈ ACirc[0,0]. Then ⟦c⟧ = id_0 if and only if c ↑.


Proof. By Proposition 3, either ⟦c⟧ = id_0 or ⟦c⟧ = ∅, which, combined with
Theorem 3, means that ⟨c⟩ = κ ∘ ι(id_0) or ⟨c⟩ = κ ∘ ι(∅). By definition of ι this
implies that either ⟨c⟩ contains a trajectory or not. In the first case c ↑; in the
second c ↑̸.

Next we obtain a result that relates denotational equality in all contexts to
equality in aIH. Note that it is not trivial: since we consider ground contexts
it does not make sense to merely consider “identity” contexts. Instead, it is at
this point that we make another crucial use of affinity, taking advantage of the
increased expressivity of affine circuits, as showcased by Proposition 2.


Proposition 6. If ⟦C[c]⟧ = ⟦C[d]⟧ for all contexts C[−], then c = d in aIH.


Proof. Suppose that c ≠ d in aIH. Then ⟦c⟧ ≠ ⟦d⟧. Since both ⟦c⟧ and ⟦d⟧ are affine
relations over k(x), there exists a pair of vectors (u, v) ∈ k(x)^n × k(x)^m that is in
one of ⟦c⟧ and ⟦d⟧, but not both. Assume wlog that (u, v) ∈ ⟦c⟧ and (u, v) ∉ ⟦d⟧.
By Proposition 2, there exist c_u and c_v such that ⟦c_u ; c ; c_v⟧ = ⟦c_u⟧ ; ⟦c⟧ ; ⟦c_v⟧ =
{(•, u)} ; ⟦c⟧ ; {(v, •)}. Since (u, v) ∈ ⟦c⟧, then ⟦c_u ; c ; c_v⟧ = {(•, •)}. Instead, since
(u, v) ∉ ⟦d⟧, we have that ⟦c_u ; d ; c_v⟧ = ∅. Therefore, for the context C[−] =
c_u ; − ; c_v, we have that ⟦C[c]⟧ ≠ ⟦C[d]⟧.


The proof of our main result is now straightforward. 


Proof of Theorem 2. Let us first suppose that c = d in aIH. Then ⟦C[c]⟧ = ⟦C[d]⟧ for
all contexts C[−], since ⟦·⟧ is a morphism of props. By Proposition 5 it follows
immediately that C[c] ↑ if and only if C[d] ↑, namely c ≡ d.

Conversely, suppose that, for all C[−], C[c] ↑ iff C[d] ↑. Again by Proposition 5
we have that ⟦C[c]⟧ = ⟦C[d]⟧. We conclude by invoking Proposition 6.


5 Functional Behaviour and Signal Flow Graphs 


There is a sub-prop SF of Circ of classical signal flow graphs (see e.g. [21]). Here
signal flows left-to-right, possibly featuring feedback loops, provided that these
go through at least one register. Feedback can be captured algebraically via an
operation Tr(·): Circ[n + 1, m + 1] → Circ[n, m] taking c: n + 1 → m + 1 to:

[circuit diagram]

Following [9], let us call $\overrightarrow{\mathsf{Circ}}$ the free sub-prop of Circ of circuits built from
[circuit diagram] and the generators of (1), without [circuit diagram]. Then SF is defined as the
closure of $\overrightarrow{\mathsf{Circ}}$ under Tr(·). For instance, the circuit of Example 2 is in SF.

Signal flow graphs are intimately connected to the executability of circuits. In
general, the rules of Figure 2 do not assume a fixed flow orientation. As a result,
some circuits in Circ are not executable as functional input-output systems, as
we have demonstrated with [circuit diagram], [circuit diagram] and [circuit diagram] of
Examples 3, 4 and 5. Notice
that none of these are signal flow graphs. In fact, the circuits of SF do not have
pathological behaviour, as we shall state more precisely in Proposition 9.

At the denotational level, signal flow graphs correspond precisely to rational
functional behaviours, that is, matrices whose coefficients are in the ring k⟨x⟩
of rational fractions (see Section 4.1). We call such matrices rational matrices.
One may check that the semantics of a signal flow graph c: (n, m) is always
of the form ⟦c⟧ = {(v, A·v) | v ∈ k(x)^n}, for some m × n rational matrix A.
Conversely, all relations that are the graph of rational matrices can be expressed
as signal flow graphs.


Proposition 7. Given c: (n, m), we have ⟦c⟧ = {(p, A·p) | p ∈ k(x)^n} for
some rational m × n matrix A iff there exists a signal flow graph f, i.e., a circuit
f: (n, m) of SF, such that ⟦f⟧ = ⟦c⟧.


Proof. This is a folklore result in control theory which can be found in [30]. The
details of the translation between rational matrices and circuits of SF can be
found in [10, Section 7].


The following gives an alternative characterisation of rational matrices (and
therefore, by Proposition 7, of the behaviour of signal flow graphs) that clarifies
their role as realisations of circuits.


Proposition 8. An m × n matrix A is rational iff A·r ∈ k⟨x⟩^m for all r ∈ k⟨x⟩^n.


Proposition 8 is another guarantee of good behaviour: it justifies the name
of inputs (resp. outputs) for the left (resp. right) ports of signal flow graphs.
Recall from Section 4.1 that rational fractions can be mapped to Laurent series
of nonnegative degree, i.e., to plain power series. Operationally, these correspond
to trajectories that start after t = 0. Proposition 8 guarantees that any trajectory
of a signal flow graph whose first nonzero value on the left appears at time t = 0,
will not have nonzero values on the right starting before time t = 0. In other
words, signal flow graphs can be seen as processing a stream of values from left to
right. As a result, their ports can be clearly partitioned into inputs and outputs.
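
As a rough illustration of this reading, the sketch below multiplies an input that starts at time 0 by the series expansion of a rational fraction, 1/(1 − x), and by that of a fraction that is not rational, 1/x. It assumes the one-dimensional case (a 1 × 1 matrix) and a truncated encoding of Laurent series as a pair (degree, coefficients); the helper names `multiply` and `first_nonzero_position` are hypothetical.

```python
from fractions import Fraction

def multiply(a, b, terms):
    """Convolution product of two truncated Laurent series.

    Each series is a pair (d, coeffs) with coeffs[j] the value at position d + j.
    Returns the first `terms` coefficients of the product, starting at position
    d_a + d_b. The result is exact when each factor is either finite or given
    with at least `terms` coefficients.
    """
    (da, ca), (db, cb) = a, b
    out = [Fraction(0)] * terms
    for i, x in enumerate(ca):
        for j, y in enumerate(cb):
            if i + j < terms:
                out[i + j] += Fraction(x) * Fraction(y)
    return da + db, out

def first_nonzero_position(series):
    """Position of the first non-zero coefficient, or None for a zero prefix."""
    d, coeffs = series
    for j, c in enumerate(coeffs):
        if c != 0:
            return d + j
    return None

r = (0, [1, 2])                  # the input 1 + 2x, starting at time 0
geometric = (0, [1, 1, 1, 1])    # first terms of 1/(1 - x), a rational fraction
shift_back = (-1, [1])           # 1/x, not a rational fraction

causal = multiply(r, geometric, 4)       # output starts at time 0
acausal = multiply(r, shift_back, 4)     # output starts at time -1
```

In this example the product with 1/(1 − x) has no non-zero value before time 0, while the product with 1/x already has one at time −1, so the latter could not be read as an output computed from the input.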

But the circuits of SF are too restrictive for our purposes. For example,
[circuit diagram] can also be seen to realise a functional behaviour transforming inputs
on the left into outputs on the right, yet it is not in SF. Its behaviour is no
longer linear, but affine. Hence, we need to extend signal flow graphs to include
functional affine behaviour. The following definition does just that.


Definition 10. Let ASF be the sub-prop of ACirc obtained from all the generators
in (1), closed under Tr(·). Its circuits are called affine signal flow graphs.


As before, none of [circuit diagram], [circuit diagram] and [circuit diagram] from
Examples 3–5 are affine signal flow graphs. In fact, ASF rules out pathological
behaviour: all computations
can be extended to be infinite, or in other words, do not get stuck.


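Proposition 9. Given an affine signal flow graph f, for every computation


$$t \rhd f_0 \xrightarrow[v_t]{u_t} f_1 \xrightarrow[v_{t+1}]{u_{t+1}} \cdots \xrightarrow[v_{t+n}]{u_{t+n}} f_{n+1}$$


there exists a trajectory σ ∈ ⟨f⟩ such that σ(i) = (u_i, v_i) for t ≤ i ≤ t + n.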


Proof. By induction on the structure of affine signal flow graphs. 

If SF circuits correspond precisely to k⟨x⟩-matrices, those of ASF correspond
precisely to k⟨x⟩-affine transformations.


Definition 11. A map f: k(x)^n → k(x)^m is an affine map if there exists an
m × n matrix A and b ∈ k(x)^m such that f(p) = A·p + b for all p ∈ k(x)^n. We
call the pair (A, b) the representation of f.


The notion of rational affine map is a straightforward extension of the linear 
case and so is the characterisation in terms of rational input-output behaviour. 


Definition 12. An affine map f: p ↦ A·p + b is rational if A and b have
coefficients in k⟨x⟩.


Proposition 10. An affine map f: k(x)^n → k(x)^m is rational iff f(r) ∈ k⟨x⟩^m
for all r ∈ k⟨x⟩^n.


The following extends the correspondence of Proposition 7, showing that ASF
is the rightful affine heir of SF.


Proposition 11. Given c: (n, m), we have ⟦c⟧ = {(p, f(p)) | p ∈ k(x)^n} for
some rational affine map f iff there exists an affine signal flow graph g, i.e., a
circuit g: (n, m) of ASF, such that ⟦g⟧ = ⟦c⟧.


Proof. Let f be given by p ↦ Ap + b for some rational m × n matrix A and
vector b ∈ k⟨x⟩^m. By Proposition 7, we can find a circuit c_A of SF such that
⟦c_A⟧ = {(p, A·p) | p ∈ k(x)^n}. Similarly, we can represent b as a signal flow
graph c_b of sort (1, m). Then, the circuit below is clearly in ASF and
verifies ⟦c⟧ = {(p, Ap + b) | p ∈ k(x)^n} as required.

[circuit diagram]

For the converse direction it is straightforward to check by structural
induction that the denotation of affine signal flow graphs is the graph (in the
set-theoretic sense of pairs of values) of some rational affine map.


6 Realisability 


In the previous section we gave a restricted class of morphisms with good be- 
havioural properties. We may wonder how much of ACirc we can capture with 
this restricted class. The answer is, in a precise sense: most of it. 

Surprisingly, the behaviours realisable in Circ (the purely linear fragment)
are not more expressive. In fact, from an operational (or denotational, by full
abstraction) point of view, Circ is nothing more than a jumbled up version of SF.
Indeed, it turns out that Circ enjoys a realisability theorem: any circuit c of Circ
can be associated with one of SF, that implements or realises the behaviour of c
into an executable form.
But the corresponding realisation may not flow neatly from left to right like
signal flow graphs do: its inputs and outputs may have been moved from one side
to the other. Consider, for example, the circuit below.

[circuit diagram]

It does not belong to SF but it can be read as a signal flow graph with an input
that has been bent and moved to the bottom right. The behaviour it realises
can therefore be executed by rewiring this port to obtain a signal flow graph:

[circuit diagrams: the circuit above and its rewiring, equal in aIH]


We will not make this notion of rewiring precise here but refer the reader to [9]
for the details. The intuition is simply that a rewiring partitions the ports of a
circuit into two sets (that we call inputs and outputs) and uses [circuit diagram]
or [circuit diagram] to bend input ports to the left and output ports to the right.
The realisability theorem then states that we can always recover a (not necessarily
unique) signal flow graph from any circuit by performing these operations.


Theorem 4 ([9, Theorem 5]). Every circuit in Circ is equivalent to the rewiring
of a signal flow graph, called its realisation.


This theorem allows us to extend the notion of inputs and outputs to all 
circuits of Circ. 


Definition 13. A port of a circuit c of Circ is an input (resp. output) port, if 
there exists a realisation for which it is an input (resp. output). 


Note that, since realisations are not necessarily unique, the same port can be
both an input and an output. Then, the realisability theorem (Theorem 4) says
that every port is always an input, an output or both (but never neither).

An output-only port is an output port that is not an input port. Similarly,
an input-only port is an input port that is not an output port.


Example 8. The left port of the register ─[x]─ is input-only whereas its right
port is output-only. In the identity wire, both ports are input and output ports.
The single port of ○─ is output-only; that of ─• is input-only.


While in the purely linear case, all behaviours are realisable, the general case 
of ACirc is a bit more subtle. To make this precise, we can extend our definition 
of realisability to include affine signal flow graphs. 


Definition 14. A circuit of ACirc is realisable if its ports can be rewired so that 
it is equivalent to a circuit of ASF. 


Example 9. [circuit diagram] is realisable; [circuit diagram] is not.


Notice that Proposition 11 gives the following equivalent semantic criterion
for realisability. Realisable behaviours are precisely those that map rationals to
rationals.


Theorem 5. A circuit c is realisable iff its ports can be partitioned into two 
sets, that we call inputs and outputs, such that the corresponding rewiring of c 
is an affine rational map from inputs to outputs. 

We offer another perspective on realisability below: realisable behaviours
correspond precisely to those for which the ⊢ constants are connected to inputs of
the underlying Circ-circuit. First, notice that, since


[two circuit diagram equations]


in aIH, we can assume without loss of generality that each circuit contains exactly
one ⊢.


Proposition 12. Every circuit c of ACirc is equivalent to one with precisely one
⊢ and no ⊣.


For c: (n, m) a circuit of ACirc, we will call ĉ the circuit of Circ of sort
(n + 1, m) that one obtains by first transforming c into an equivalent circuit
with a single ⊢ and no ⊣ as above, then removing this ⊢, and replacing it by
an identity wire that extends to the left boundary.


Theorem 6. A circuit c is realisable iff ⊢ is connected to an input port of ĉ.


7 Conclusion and Future Work 


We introduced the operational semantics of the affine extension of the signal 
flow calculus and proved that contextual equivalence coincides with denotational 
equality, previously introduced and axiomatised in [6]. We have observed that, 
at the denotational level, affinity provides two key properties (Propositions 2
and 3) for the proof of full abstraction. However, at the operational level, affinity
forces us to consider computations starting in the past (Example 3) as the
syntax allows terms lacking a proper flow directionality. This leads to circuits
that might deadlock ([circuit diagram] in Example 4) or perform some problematic
computations ([circuit diagram] in Example 5). We have identified a proper subclass
of circuits, called affine signal flow graphs (Definition 10), that possess an inherent flow
directionality: in these circuits, the same pathological behaviours do not arise
(Proposition 9). This class is not too restrictive as it captures all desirable
behaviours: a realisability result (Theorem 5) states that all and only the circuits
that do not need computations to start in the past are equivalent to (the rewiring
of) an affine signal flow graph.

The reader may be wondering why we do not restrict the syntax to affine 
signal flow graphs. The reason is that, like in the behavioural approach to control 
theory [33], the lack of flow direction is what allows the (affine) signal flow calcu- 
lus to achieve a strong form of compositionality and a complete axiomatisation 
(see [9] for a deeper discussion). 

We expect that similar methods and results can be extended to other models 
of computation. Our next step is to tackle Petri nets, which, as shown in [5], can 
be regarded as terms of the signal flow calculus, but over N rather than a field. 

Abstract. We study the synthesis problem for systems with a parameterized
number of processes. As in the classical case due to Church, the
system selects actions depending on the program run so far, with the aim 
of fulfilling a given specification. The difficulty is that, at the same time, 
the environment executes actions that the system cannot control. In con- 
trast to the case of fixed, finite alphabets, here we consider the case of 
parameterized alphabets. An alphabet reflects the number of processes, 
which is static but unknown. The synthesis problem then asks whether 
there is a finite number of processes for which the system can satisfy the 
specification. This variant is already undecidable for very limited logics. 
Therefore, we consider a first-order logic without the order on word posi- 
tions. We show that even in this restricted case synthesis is undecidable 
if both the system and the environment have access to all processes. On 
the other hand, we prove that the problem is decidable if the environ- 
ment only has access to a bounded number of processes. In that case, 
there is even a cutoff meaning that it is enough to examine a bounded 
number of process architectures to solve the synthesis problem. 


1 Introduction 


Synthesis deals with the problem of automatically generating a program that 
satisfies a given specification. The problem goes back to Church [9], who formu- 
lated it as follows: The environment and the system alternately select an input 
symbol and an output symbol from a finite alphabet, respectively, and in this 
way generate an infinite sequence. The question now is whether the system has a 
winning strategy, which guarantees that the resulting infinite run is contained in 
a given (ω-)regular language representing the specification, no matter how the
environment behaves. This problem is decidable and very well understood [8,37], 
and it has been extended in several different ways (e.g., [24, 26, 28,36,43]). 

In this paper, we consider a variant of the synthesis problem that allows us 
to model programs with a variable number of processes. As we then deal with 
an unbounded number of process identifiers, a fixed finite alphabet is not suit- 
able anymore. It is more appropriate to use an infinite alphabet, in which every 
letter contains a process identifier and a program action. One can distinguish
two cases here. In [16], a potentially infinite number of data values are involved 
in an infinite program run (e.g. by dynamic process generation). In a parameter- 
ized system [4,13], on the other hand, one has an unknown but static number 
of processes so that, along each run, the number of processes is finite. In this 
paper, we are interested in the latter, i.e., parameterized case. Parameterized 
programs are ubiquitous and occur, e.g., in distributed algorithms, ad-hoc net- 
works, telecommunication protocols, cache-coherence protocols, swarm robotics, 
and biological systems. The synthesis question asks whether the system has a 
winning strategy for some number of processes (existential version) or no matter 
how many processes there are (universal version). 


Over infinite alphabets, there are a variety of different specification languages 
(e.g., [5, 11, 12, 19,29, 33,40]). Unlike in the case of finite alphabets, there is no 
canonical definition of regular languages. In fact, the synthesis problem has been 
studied for N-memory automata [7], the Logic of Repeating Values [16], and reg- 
ister automata [15,30,31]. Though there is no agreement on a “regular” automata 
model, first-order (FO) logic over data words can be considered as a canonical 
logic, and this is the specification language we consider here. In addition to 
classical FO logic on words over finite alphabets, it provides a predicate x ~ y 
to express that two events x and y are triggered by the same process. Its
two-variable fragment FO² has a decidable emptiness and universality problem [5]
and is, therefore, a promising candidate for the synthesis problem. 


Previous generalizations of Church’s synthesis problem to infinite alphabets 
were generally synchronous in the sense that the system and the environment 
perform their actions in strictly alternating order. This assumption was made, 
e.g., in the above-mentioned recent papers [7, 15, 16,30,31]. If there are several 
processes, however, it is realistic to relax this condition, which leads us to an 
asynchronous setting in which the system has no influence on when the envi- 
ronment acts. Like in [21], where the asynchronous case for a fixed number of 
processes was considered, we only make the reasonable fairness assumption that 
the system is not blocked forever. 


In summary, the synthesis problem over infinite alphabets can be classified 
as (i) parameterized vs. dynamic, (ii) synchronous vs. asynchronous, and (iii) 
according to the specification language (register automata, Logic of Repeating 
Values, FO logic, etc.). As explained above, we consider here the parameter- 
ized asynchronous case for specifications written in FO logic. To the best of our 
knowledge, this combination has not been considered before. For flexible model- 
ing, we also distinguish between three types of processes: those that can only be 
controlled by the system; those that can only be controlled by the environment; 
and finally those that can be triggered by both. A partition into system and 
environment processes is also made in [3,18], but for a fixed number of processes 
and in the presence of an arena in terms of a Petri net. 

Let us briefly describe our results. We show that the general case of the 
synthesis problem is undecidable for FO² logic. This follows from an adaptation
of an undecidability result from [16,17] for a fragment of the Logic of Repeating 
Values [11]. We therefore concentrate on an orthogonal logic, namely FO without
the order on the word positions. First, we show that this logic can essentially
count processes and actions of a given process up to some threshold. Though
it has limited expressive power (albeit orthogonal to that of FO²), it leads to
intricate behaviors in the presence of an uncontrollable environment. In fact, we 
show that the synthesis problem is still undecidable. Due to the lack of the order 
relation, the proof requires a subtle reduction from the reachability problem in 
2-counter Minsky machines. However, it turns out that the synthesis problem is 
decidable if the number of processes that are controllable by the environment 
is bounded, while the number of system processes remains unbounded. In this 
case, there is even a cutoff k, an important measure for parameterized systems 
(cf. [4] for an overview): If the system has a winning strategy for k processes, 
then it has one for any number of processes greater than k, and the same applies 
to the environment. The proofs of both main results rely on a reduction of the 
synthesis problem to turn-based parameterized vector games, in which, similar to 
Petri nets, tokens corresponding to processes are moved around between states. 

The paper is structured as follows. In Section 2, we define FO logic (especially 
FO without word order), and in Section 3, we present the parameterized synthesis 
problem. In Section 4, we transform a given formula into a normal form and 
finally into a parameterized vector game. Based on this reduction, we investigate 
cutoff properties and show our (un)decidability results in Section 5. We conclude 
in Section 6. Some proof details can be found in the long version of this paper [2].


2 Preliminaries 


For a finite or infinite alphabet Σ, let Σ* and Σ^ω denote the sets of finite and,
respectively, infinite words over Σ. The empty word is ε. Given w ∈ Σ* ∪ Σ^ω,
let |w| denote the length of w and Pos(w) its set of positions: |w| = n and
Pos(w) = {1, …, n} if w = σ_1σ_2…σ_n ∈ Σ*, and |w| = ω and Pos(w) =
{1, 2, …} if w ∈ Σ^ω. Let w[i] be the i-th letter of w for all i ∈ Pos(w).


Executions. We consider programs involving a finite (but not fixed) number 
of processes. Processes are controlled by antagonistic protagonists, System and 
Environment. Accordingly, each process has a type among T = {s, e, se}, and we
let P_s, P_e, and P_se denote the pairwise disjoint finite sets of processes controlled
by System, by Environment, and by both System and Environment, respectively.
We let P denote the triple (P_s, P_e, P_se). Abusing notation, we sometimes refer to
P as the disjoint union P_s ∪ P_e ∪ P_se.

Given any set S, vectors s ∈ S^T are usually referred to as triples s =
(s_s, s_e, s_se). Moreover, for s, s′ ∈ ℕ^T, we write s ≤ s′ if s_θ ≤ s′_θ for all θ ∈ T.
Finally, let s + s′ = (s_s + s′_s, s_e + s′_e, s_se + s′_se).

Processes can execute actions from a finite alphabet A. Whenever an action 
is executed, we would like to know whether it was triggered by System or by 
Environment. Therefore, A is partitioned into A = A_s ⊎ A_e. Let Σ_s = A_s × (P_s ∪ P_se)
and Σ_e = A_e × (P_e ∪ P_se). Their union Σ = Σ_s ∪ Σ_e is the set of events. A word
w ∈ Σ* ∪ Σ^ω is called a P-execution.


[Figure 1: diagram of the structure for the execution of Example 1, with A_s = {a, b} and A_e = {c, d}]


Fig. 1. Representation of P-execution as a mathematical structure


Logic. Formulas of our logic are evaluated over P-executions. We fix an infinite
supply V = {x, y, z, …} of variables, which are interpreted as processes from P
or positions of the execution. The logic FO_A[∼, <, +1] is given by the grammar


φ ::= θ(x) | a(x) | x = y | x ∼ y | x < y | +1(x, y) | ¬φ | φ ∨ φ | ∃x.φ


where x, y ∈ V, θ ∈ T, and a ∈ A. Conjunction (∧), universal quantification (∀),
implication (⇒), true, and false are obtained as abbreviations as usual.

Let φ ∈ FO_A[∼, <, +1]. By Free(φ) ⊆ V, we denote the set of variables that
occur free in φ. If Free(φ) = ∅, then we call φ a sentence. We sometimes write
φ(x_1, …, x_n) to emphasize the fact that Free(φ) ⊆ {x_1, …, x_n}.

To evaluate φ over a P-execution w = (a_1, p_1)(a_2, p_2)…, we consider (P, w) as
a structure S_(P,w) = (P ⊎ Pos(w), P_s, P_e, P_se, (R_a)_{a∈A}, ∼, <, +1) where P ⊎ Pos(w)
is the universe, P_s, P_e, and P_se are interpreted as unary relations, R_a is the unary
relation {i ∈ Pos(w) | a_i = a}, < = {(i, j) ∈ Pos(w) × Pos(w) | i < j},
+1 = {(i, i + 1) | 1 ≤ i < |w|}, and ∼ is the smallest equivalence relation over
P ⊎ Pos(w) containing


– (p, i) for all p ∈ P and i ∈ Pos(w) such that p = p_i, and
– (i, j) for all (i, j) ∈ Pos(w) × Pos(w) such that p_i = p_j.


An equivalence class of ∼ is often simply referred to as a class. Note that it
contains exactly one process.


Example 1. Suppose $A_s = \{a,b\}$ and $A_e = \{c,d\}$. Let the set of processes
$P$ be given by $P_s = \{1,2,3\}$, $P_e = \{4,5\}$, and $P_{se} = \{6,7,8\}$. Moreover, let
$w = (a,1)(b,8)(d,7)(c,4)(a,6)(c,6)(a,7)(d,6)(b,2)(d,7)(a,7) \in \Sigma^*$. Figure 1
illustrates $S_{(P,w)}$. The edge relation represents $+1$, its transitive closure is $<$. $\triangleleft$


An interpretation for $(P, w)$ is a partial mapping $I : \mathcal{V} \to P \cup \mathrm{Pos}(w)$. Suppose
$\varphi \in \mathrm{FO}_A[\sim, <, +1]$ such that $\mathrm{Free}(\varphi) \subseteq \mathrm{dom}(I)$. The satisfaction relation
$(P, w), I \models \varphi$ is then defined as expected, based on the structure $S_{(P,w)}$ and
interpreting free variables according to $I$. For example, let $w = (a_1, p_1)(a_2, p_2) \ldots$
and $i \in \mathrm{Pos}(w)$. Then, for $I(x) = i$, we have $(P, w), I \models a(x)$ if $a_i = a$.

We identify some fragments of $\mathrm{FO}_A[\sim, <, +1]$. For $R \subseteq \{\sim, <, +1\}$, let $\mathrm{FO}_A[R]$
denote the set of formulas that do not use symbols in $\{\sim, <, +1\} \setminus R$. Moreover,
$\mathrm{FO}^2_A[R]$ denotes the fragment of $\mathrm{FO}_A[R]$ that uses only two (reusable) variables.
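Let $\varphi(x_1, \ldots, x_n, y) \in \mathrm{FO}_A[\sim, <, +1]$ be a formula and $m \in \mathbb{N}$. We use
$\exists^{\ge m} y.\varphi(x_1, \ldots, x_n, y)$ as an abbreviation for

\[ \exists y_1 \ldots \exists y_m. \bigwedge_{1 \le i < j \le m} \neg(y_i = y_j) \;\wedge \bigwedge_{1 \le i \le m} \varphi(x_1, \ldots, x_n, y_i) \]

if $m > 0$, and $\exists^{\ge 0} y.\varphi(x_1, \ldots, x_n, y) = \mathit{true}$. Thus, $\exists^{\ge m} y.\varphi$ says that there are at
least $m$ distinct elements that verify $\varphi$. We also use $\exists^{=m} y.\varphi$ as an abbreviation
for $\exists^{\ge m} y.\varphi \wedge \neg\exists^{\ge m+1} y.\varphi$. Note that $\varphi \in \mathrm{FO}_A[R]$ implies that $\exists^{\ge m} y.\varphi \in \mathrm{FO}_A[R]$
and $\exists^{=m} y.\varphi \in \mathrm{FO}_A[R]$.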


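Example 2. Let $A$, $P$, and $w$ be like in Example 1 and Figure 1.

- $\varphi_1 = \forall x.((s(x) \vee se(x)) \Rightarrow \exists y.(x \sim y \wedge (a(y) \vee b(y))))$ says that each
  process that System can control executes at least one system action. We
  have $\varphi_1 \in \mathrm{FO}^2_A[\sim]$ and $(P, w) \not\models \varphi_1$, as process 3 is idle.

- $\varphi_2 = \forall x.(d(x) \Rightarrow \exists y.(x \sim y \wedge a(y)))$ says that, for every $d$, there is an $a$
  on the same process. We have $\varphi_2 \in \mathrm{FO}^2_A[\sim]$ and $(P, w) \models \varphi_2$.

- $\varphi_3 = \forall x.(d(x) \Rightarrow \exists y.(x \sim y \wedge x < y \wedge a(y)))$ says that every $d$ is eventually
  followed by an $a$ executed by the same process. We have $\varphi_3 \in \mathrm{FO}^2_A[\sim, <]$
  and $(P, w) \not\models \varphi_3$: The event $(d,6)$ is not followed by some $(a,6)$.

- $\varphi_4 = \forall x.((\exists^{=2} y.(x \sim y \wedge a(y))) \Leftrightarrow (\exists^{=2} y.(x \sim y \wedge d(y))))$ says that
  each class contains exactly two occurrences of $a$ iff it contains exactly two
  occurrences of $d$. Moreover, $\varphi_4 \in \mathrm{FO}_A[\sim]$ and $(P, w) \models \varphi_4$. Note that $\varphi_4 \notin
  \mathrm{FO}^2_A[\sim]$, as $\exists^{=2} y$ requires the use of three different variable names. $\triangleleft$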
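The four claims of Example 2 can be checked mechanically on the execution of Example 1. The following is a minimal sketch, assuming a hypothetical encoding of the execution as a Python list of (letter, process) pairs; it hard-codes the meaning of each formula rather than implementing a general evaluator for the logic.

```python
# Example 1 data: processes by type and the execution w (encoding is an assumption).
P_s, P_e, P_se = {1, 2, 3}, {4, 5}, {6, 7, 8}
w = [("a", 1), ("b", 8), ("d", 7), ("c", 4), ("a", 6), ("c", 6),
     ("a", 7), ("d", 6), ("b", 2), ("d", 7), ("a", 7)]


def count(letter, process):
    """Number of occurrences of `letter` in the class of `process`."""
    return sum(1 for (x, p) in w if x == letter and p == process)


# phi1: every process controllable by System executes some system action.
phi1 = all(count("a", p) + count("b", p) > 0 for p in P_s | P_se)

# phi2: for every d, there is an a in the same class.
phi2 = all(count("a", p) > 0 for (x, p) in w if x == "d")

# phi3: every d is followed later by an a on the same process.
phi3 = all(any(y == "a" and q == p for (y, q) in w[i + 1:])
           for i, (x, p) in enumerate(w) if x == "d")

# phi4: a class has exactly two a's iff it has exactly two d's.
# Every class contains exactly one process, so ranging over processes suffices.
phi4 = all((count("a", p) == 2) == (count("d", p) == 2)
           for p in P_s | P_e | P_se)

print(phi1, phi2, phi3, phi4)
```

The values agree with the example: the first and third formulas fail (process 3 is idle; the event $(d,6)$ has no later $(a,6)$), the other two hold.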


3 Parameterized Synthesis Problem 


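We define an asynchronous synthesis problem. A $P$-strategy (for System) is a
mapping $f : \Sigma^* \to \Sigma_s \cup \{\varepsilon\}$. A $P$-execution $w = \sigma_1 \sigma_2 \ldots \in \Sigma^* \cup \Sigma^\omega$ is
$f$-compatible if, for all $i \in \mathrm{Pos}(w)$ such that $\sigma_i \in \Sigma_s$, we have $f(\sigma_1 \ldots \sigma_{i-1}) = \sigma_i$.
We call $w$ $f$-fair if the following hold: (i) if $w$ is finite, then $f(w) = \varepsilon$, and (ii)
if $w$ is infinite and $f(\sigma_1 \ldots \sigma_{i-1}) \ne \varepsilon$ for infinitely many $i \ge 1$, then $\sigma_j \in \Sigma_s$ for
infinitely many $j \ge 1$.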

Let $\varphi \in \mathrm{FO}_A[\sim, <, +1]$ be a sentence. We say that $f$ is $P$-winning for $\varphi$ if,
for every $P$-execution $w$ that is $f$-compatible and $f$-fair, we have $(P, w) \models \varphi$.

The existence of a $P$-strategy that is $P$-winning for a given formula does not
depend on the concrete process identities but only on the cardinality of the sets
$P_s$, $P_e$, and $P_{se}$. This motivates the following definition of winning triples for a
formula. Given $\varphi$, let $\mathrm{Win}(\varphi)$ be the set of triples $(k_s, k_e, k_{se}) \in \mathbb{N}^T$ for which
there is $P = (P_s, P_e, P_{se})$ such that $|P_\theta| = k_\theta$ for all $\theta \in T$ and there is a $P$-strategy
that is $P$-winning for $\varphi$.

Let $\mathbf{0} = \{0\}$ and $k_e, k_{se} \in \mathbb{N}$. In this paper, we focus on the intersection of
$\mathrm{Win}(\varphi)$ with the sets $\mathbb{N} \times \mathbf{0} \times \mathbf{0}$ (which corresponds to the usual satisfiability
problem); $\mathbb{N} \times \{k_e\} \times \{k_{se}\}$ (there is a constant number of environment and
mixed processes); $\mathbb{N} \times \mathbb{N} \times \{k_{se}\}$ (there is a constant number of mixed processes);
$\mathbf{0} \times \mathbf{0} \times \mathbb{N}$ (each process is controlled by both System and Environment).
Definition 3 (synthesis problem). For fixed $\mathcal{F} \in \{\mathrm{FO}, \mathrm{FO}^2\}$, set of relation
symbols $R \subseteq \{\sim, <, +1\}$, and $N_s, N_e, N_{se} \subseteq \mathbb{N}$, the (parameterized) synthesis
problem is given as follows:

SYNTH($\mathcal{F}[R], N_s, N_e, N_{se}$)

Input: $A = A_s \uplus A_e$ and a sentence $\varphi \in \mathcal{F}_A[R]$
Question: $\mathrm{Win}(\varphi) \cap (N_s \times N_e \times N_{se}) \ne \emptyset$?


The satisfiability problem for $\mathcal{F}[R]$ is defined as SYNTH($\mathcal{F}[R], \mathbb{N}, \mathbf{0}, \mathbf{0}$).


Example 4. Suppose $A_s = \{a,b\}$ and $A_e = \{c,d\}$, and consider the formulas
$\varphi_1$-$\varphi_4$ from Example 2.

First, we have $\mathrm{Win}(\varphi_1) = \mathbb{N}^T$. Given an arbitrary $P$ and any total order $\sqsubseteq$
over $P_s \cup P_{se}$, a possible $P$-strategy $f$ that is $P$-winning for $\varphi_1$ maps $w \in \Sigma^*$ to
$(a, p)$ if $p$ is the smallest process from $P_s \cup P_{se}$ wrt. $\sqsubseteq$ that does not occur in $w$,
and returns $\varepsilon$ for $w$ if all processes from $P_s \cup P_{se}$ already occur in $w$.

For the three formulas $\varphi_2$, $\varphi_3$, and $\varphi_4$, observe that, since $d$ is an environment
action, if there is at least one process that is exclusively controlled by Environment,
then there is no winning strategy. Hence we must have $P_e = \emptyset$. In fact,
this condition is sufficient in the three cases and the strategies described below
show that all three sets $\mathrm{Win}(\varphi_2)$, $\mathrm{Win}(\varphi_3)$, and $\mathrm{Win}(\varphi_4)$ are equal to $\mathbb{N} \times \mathbf{0} \times \mathbb{N}$.

- For $\varphi_2$, the very same strategy as for $\varphi_1$ also works in this case, producing
  an $a$ for every process in $P_s \cup P_{se}$, whether there is a $d$ or not.

- For $\varphi_3$, a winning strategy $f$ will apply the previous mechanism iteratively,
  performing $(a,p)$ for $p \in P_{se} = \{p_0, \ldots, p_{n-1}\}$ over and over again:
  $f(w) = (a, p_i)$ where $i$ is the number of occurrences of letters from $\Sigma_s$ modulo
  $n$. By the fairness assumption, this guarantees satisfaction of $\varphi_3$. A more
  "economical" winning strategy $f'$ may organize pending requests in terms of
  $d$ in a queue and acknowledge them successively. More precisely, given $u \in P^*$
  and $\sigma \in \Sigma$, we define another word $u \odot \sigma \in P^*$ by $u \odot (d,p) = u \cdot p$ (inserting $p$
  in the queue) and $(p \cdot u) \odot (a,p) = u$ (deleting it). In all other cases, $u \odot \sigma = u$.
  Let $w = \sigma_1 \ldots \sigma_n \in \Sigma^*$, with queue $((\varepsilon \odot \sigma_1) \odot \sigma_2 \ldots) \odot \sigma_n = p_1 \ldots p_k$. We
  let $f'(w) = \varepsilon$ if $k = 0$, and $f'(w) = (a, p_1)$ if $k \ge 1$.

- For $\varphi_4$, the strategy $f'$ for $\varphi_3$ ensures that every $d$ has a corresponding $a$ so
  that, in the long run, there are as many $a$'s as $d$'s in every class. $\triangleleft$
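The queue-based strategy $f'$ from Example 4 is easy to prototype. The sketch below is only an illustration under assumptions: events are hypothetical (letter, process) pairs, the queue is a Python list, and `None` stands for $\varepsilon$. The names `queue_after` and `f_prime` are not from the paper.

```python
def queue_after(w):
    """Fold the queue operation over the execution w.

    A d on process p appends p to the queue; an a on the process at the head
    of the queue removes it; every other event leaves the queue unchanged.
    """
    queue = []
    for (letter, p) in w:
        if letter == "d":
            queue.append(p)
        elif letter == "a" and queue and queue[0] == p:
            queue.pop(0)
    return queue


def f_prime(w):
    """Next system event proposed after w, or None (standing for epsilon)."""
    queue = queue_after(w)
    return ("a", queue[0]) if queue else None


# Two pending requests are acknowledged in arrival order.
history = [("d", 6), ("d", 7)]
while (event := f_prime(history)) is not None:
    history.append(event)
print(history)
```

Once every pending $d$ has been acknowledged, the strategy returns $\varepsilon$, which is what the fairness condition requires of a finite execution.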


Another interesting question is whether System (or Environment) has a winning
strategy as soon as the number of processes is big enough. This leads to the
notion of a cutoff (cf. [4] for an overview): Let $N_s, N_e, N_{se} \subseteq \mathbb{N}$ and $W \subseteq \mathbb{N}^T$. We
call $k_0 \in \mathbb{N}^T$ a cutoff of $W$ wrt. $(N_s, N_e, N_{se})$ if $k_0 \in N_s \times N_e \times N_{se}$ and either

- for all $k \in N_s \times N_e \times N_{se}$ such that $k \ge k_0$, we have $k \in W$, or
- for all $k \in N_s \times N_e \times N_{se}$ such that $k \ge k_0$, we have $k \notin W$.


Let $\mathcal{F} \in \{\mathrm{FO}, \mathrm{FO}^2\}$ and $R \subseteq \{\sim, <, +1\}$. If, for every alphabet $A = A_s \uplus A_e$
and every sentence $\varphi \in \mathcal{F}_A[R]$, the set $\mathrm{Win}(\varphi)$ has a computable cutoff wrt.
Table 1. Summary of results. Our contributions are highlighted in bold.

Synthesis                   ($\mathbb{N}$, 0, 0)        ($\mathbb{N}$, {$k_e$}, {$k_{se}$})   ($\mathbb{N}$, $\mathbb{N}$, 0)   (0, 0, $\mathbb{N}$)
$\mathrm{FO}^2[\sim, <, +1]$    decidable [5]      ?                    ?           undecidable
$\mathrm{FO}^2[\sim, <]$        NEXPTIME-c. [5]    ?                    ?           ?
$\mathrm{FO}[\sim]$             decidable          decidable            ?*          undecidable

* We show, however, that there is no cutoff.


$(N_s, N_e, N_{se})$, then we know that SYNTH($\mathcal{F}[R], N_s, N_e, N_{se}$) is decidable, as it
can be reduced to a finite number of simple synthesis problems over a finite
alphabet. The latter can be solved, e.g., using attractor-based backward search
(cf. [42]). This is how we will show decidability of SYNTH($\mathrm{FO}[\sim], \mathbb{N}, \{k_e\}, \{k_{se}\}$)
for all $k_e, k_{se} \in \mathbb{N}$.


Our contributions are summarized in Table 1. Note that known satisfiability 
results for data logic apply to our logic, as processes can be simulated by treating 
every $\theta \in T$ as an ordinary letter. Let us first state undecidability of the general
synthesis problem, which motivates the study of other FO fragments. 


Theorem 5. The problem SYNTH($\mathrm{FO}^2[\sim, <, +1], \mathbf{0}, \mathbf{0}, \mathbb{N}$) is undecidable.


Proof (sketch). We adapt the proof from [16,17] reducing the halting problem 
for 2-counter machines. We show that their encoding can be expressed in our 
logic, even if we restrict it to two variables, and can also be adapted to the 
asynchronous setting. 


4 FO[~] and Parameterized Vector Games 


Due to the undecidability result of Theorem 5, one has to switch to other frag- 
ments of first-order logic. We will henceforth focus on the logic FO[~] and es- 
tablish some important properties, such as a normal form, that will allow us to 
deduce a couple of results, both positive and negative. 


4.1 Satisfiability and Normal Form for FO[~] 


We first show that $\mathrm{FO}[\sim]$ logic essentially allows one to count letters in a class
up to some threshold, and to count such classes up to some other threshold.
Let $B \in \mathbb{N}$ and $\ell \in \{0, \ldots, B\}^A$. Intuitively, $\ell(a)$ imposes a constraint on the
number of occurrences of $a$ in a class. We first define an $\mathrm{FO}_A[\sim]$-formula $\psi_{B,\ell}(y)$
verifying that, in the class defined by $y$, the number of occurrences of each letter
$a \in A$, counted up to $B$, is $\ell(a)$:

\[ \psi_{B,\ell}(y) \;=\; \bigwedge_{\substack{a \in A \\ \ell(a) < B}} \exists^{=\ell(a)} z.(y \sim z \wedge a(z)) \;\wedge \bigwedge_{\substack{a \in A \\ \ell(a) = B}} \exists^{\ge \ell(a)} z.(y \sim z \wedge a(z)) \]
Theorem 6 (normal form for FO[~]). Let $\varphi \in \mathrm{FO}_A[\sim]$ be a sentence. There
is a computable $B \in \mathbb{N}$ such that $\varphi$ is effectively equivalent to a disjunction of
conjunctions of formulas of the form $\exists^{\bowtie m} y.(\theta(y) \wedge \psi_{B,\ell}(y))$ where $\bowtie \;\in \{\ge, =\}$,
$m \in \mathbb{N}$, $\theta \in T$, and $\ell \in \{0, \ldots, B\}^A$.


The normal form can be obtained using known normal-form constructions 
[23,41] for general FO logic [2], or using Ehrenfeucht-Fraissé games [39], or using 
a direct inductive transformation in the spirit of [23]. 


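Example 7. Recall the formula $\varphi_4 = \forall x.((\exists^{=2} y.(x \sim y \wedge a(y))) \Leftrightarrow (\exists^{=2} y.(x \sim
y \wedge d(y)))) \in \mathrm{FO}_A[\sim]$ from Example 2, over $A_s = \{a,b\}$ and $A_e = \{c,d\}$. An
equivalent formula in normal form is $\varphi'_4 = \bigwedge_{\theta \in T,\, \ell \in Z} \exists^{=0} y.(\theta(y) \wedge \psi_{3,\ell}(y))$ where
$Z$ is the set of vectors $\ell \in \{0, \ldots, 3\}^A$ such that $\ell(a) = 2 \ne \ell(d)$ or $\ell(d) = 2 \ne
\ell(a)$. The formula indeed says that there is no class with $=2$ occurrences of $a$
and $\ne 2$ occurrences of $d$ or vice versa, which is equivalent to $\varphi_4$. $\triangleleft$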
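To make the normal form of Example 7 concrete, the following sketch enumerates the set $Z$ and checks the normal-form condition on an execution by computing, for each process, its letter counts capped at the threshold 3. The data layout (tuples ordered as a, b, c, d; events as (letter, process) pairs) and the function names are assumptions made for illustration only.

```python
from itertools import product

A = ("a", "b", "c", "d")   # letter order used for count vectors (an assumption)
B = 3                      # threshold from Example 7

# Z: count vectors with exactly two a's but not two d's, or vice versa.
Z = {l for l in product(range(B + 1), repeat=len(A))
     if (l[A.index("a")] == 2) != (l[A.index("d")] == 2)}


def profile(w, p):
    """Letter counts of process p in w, each capped at B."""
    return tuple(min(sum(1 for (x, q) in w if x == letter and q == p), B)
                 for letter in A)


def satisfies_normal_form(w, processes):
    """True iff no class has a profile in Z (zero classes per forbidden vector)."""
    return all(profile(w, p) not in Z for p in processes)


# Execution and processes of Example 1.
processes = range(1, 9)
w = [("a", 1), ("b", 8), ("d", 7), ("c", 4), ("a", 6), ("c", 6),
     ("a", 7), ("d", 6), ("b", 2), ("d", 7), ("a", 7)]
print(len(Z), satisfies_normal_form(w, processes))
```

Appending one more $(a,6)$ gives process 6 two $a$'s but only one $d$, so its profile falls into $Z$ and the check fails.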


Thanks to the normal form, it is sufficient to test finitely many structures to 
determine whether a given formula is satisfiable: 


Corollary 8. The satisfiability problem for $\mathrm{FO}[\sim]$ over data words is decidable.
Moreover, every satisfiable $\mathrm{FO}_A[\sim]$ formula has a finite model.


Note that the satisfiability problem for $\mathrm{FO}^2[\sim]$ is already NEXPTIME-hard,
due to NEXPTIME-hardness for two-variable logic with unary relations only
[14,20,22]. In fact, it is NEXPTIME-complete due to the upper bound for $\mathrm{FO}^2[\sim, <]$
[5]. It is worth mentioning that two-variable logic with one equivalence relation
on arbitrary structures also has the finite-model property [32].


4.2 From Synthesis to Parameterized Vector Games 


Exploiting the normal form for $\mathrm{FO}_A[\sim]$, we now present a reduction of the synthesis
problem to a strictly turn-based two-player game. This game is conceptually
simpler and easier to reason about. The reduction works in both directions,
which will allow us to derive both decidability and undecidability results.

Note that, given a formula $\varphi \in \mathrm{FO}_A[\sim]$ (which we suppose to be in normal
form with threshold $B$), the order of letters in an execution does not matter.
Thus, given some $P$, a reasonable strategy for Environment would be to just "wait
and see". More precisely, it does not put Environment into a worse position if,
given the current execution $w \in \Sigma^*$, it lets System execute as many actions
as it wants in terms of a word $u \in \Sigma_s^*$. Due to the fairness assumption, System
would be able to execute all the letters from $u$ anyway. Environment can even
require System to play a word $u$ such that $(P, wu) \models \varphi$. If System is not able to
produce such a word, Environment can just sit back and do nothing. Conversely,
upon $wu$ satisfying $\varphi$, Environment has to be able to come up with a word
$v \in \Sigma_e^*$ such that $(P, wuv) \not\models \varphi$. This leads to a turn-based game in which
System and Environment play in strictly alternate order and have to provide a
satisfying and, respectively, falsifying execution.
In a second step, we can get rid of process identifiers: According to our
normal form, all we are interested in is the number of processes that agree
on their letters counted up to threshold $B$. That is, a finite execution can be
abstracted as a configuration $C : L \to \mathbb{N}^T$ where $L = \{0, \ldots, B\}^A$. For $\ell \in L$ and
$C(\ell) = (n_s, n_e, n_{se})$, $n_\theta$ is the number of processes of type $\theta$ whose letter count
up to threshold $B$ corresponds to $\ell$. We can also say that $\ell$ contains $n_\theta$ tokens
of type $\theta$. If it is System's turn, it will pick some pairs $(\ell, \ell')$ and move some
tokens of type $\theta \in \{s, se\}$ from $\ell$ to $\ell'$, provided $\ell(a) \le \ell'(a)$ for all $a \in A_s$, and
$\ell(a) = \ell'(a)$ for all $a \in A_e$. This actually corresponds to adding more system
letters in the corresponding processes. Environment proceeds analogously.

Finally, the formula $\varphi$ naturally translates to an acceptance condition $F \subseteq \mathfrak{C}^L$
over configurations, where $\mathfrak{C}$ is the set of local acceptance conditions, which are of
the form $(\bowtie_s n_s, \bowtie_e n_e, \bowtie_{se} n_{se})$ where $\bowtie_s, \bowtie_e, \bowtie_{se} \in \{=, \ge\}$ and $n_s, n_e, n_{se} \in \mathbb{N}$.

We end up with a turn-based game in which, similarly to a VASS game [1,6, 
10,27,38], System and Environment move tokens along vectors from L. Note that, 
however, our games have a very particular structure so that undecidability for 
VASS games does not carry over to our setting. Moreover, existing decidability 
results do not allow us to infer our cutoff results below. 

In the following, we will formalize parameterized vector games. 


Definition 9. A parameterized vector game (or simply game) is given by a
triple $G = (A, B, F)$ where $A = A_s \uplus A_e$ is the finite alphabet, $B \in \mathbb{N}$ is a bound,
and, letting $L = \{0, \ldots, B\}^A$, $F \subseteq \mathfrak{C}^L$ is a finite set called acceptance condition.


Locations. Let $\ell_0$ be the location such that $\ell_0(a) = 0$ for all $a \in A$. For $\ell \in L$
and $a \in A$, we define $\ell + a$ by $(\ell + a)(b) = \ell(b)$ for $b \ne a$ and $(\ell + a)(b) =
\min\{\ell(a) + 1, B\}$ otherwise, so that counts saturate at the bound $B$. This is extended
for all $u \in A^*$ and $a \in A$ by $\ell + \varepsilon = \ell$ and $\ell + ua = (\ell + u) + a$. By
$\langle\!\langle w \rangle\!\rangle$, we denote the location $\ell_0 + w$.


Configurations. As explained above, a configuration of $G$ is a mapping $C : L \to
\mathbb{N}^T$. Suppose that, for $\ell \in L$ and $\theta \in T$, we have $C(\ell) = (n_s, n_e, n_{se})$. Then, we
let $C(\ell, \theta)$ refer to $n_\theta$. By $\mathit{Conf}$, we denote the set of all configurations.


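Transitions. A system transition (respectively environment transition) is a mapping
$\tau : L \times L \to (\mathbb{N} \times \{0\} \times \mathbb{N})$ (respectively $\tau : L \times L \to (\{0\} \times \mathbb{N} \times \mathbb{N})$) such that,
for all $(\ell, \ell') \in L \times L$ with $\tau(\ell, \ell') \ne (0,0,0)$, there is a word $w \in A_s^*$ (respectively
$w \in A_e^*$) such that $\ell' = \ell + w$. Let $T_s$ denote the set of system transitions, $T_e$ the
set of environment transitions, and $T = T_s \cup T_e$ the set of all transitions.

For $\tau \in T$, let the mappings $\mathit{out}_\tau, \mathit{in}_\tau : L \to \mathbb{N}^T$ be defined by $\mathit{out}_\tau(\ell) =
\sum_{\ell' \in L} \tau(\ell, \ell')$ and $\mathit{in}_\tau(\ell) = \sum_{\ell' \in L} \tau(\ell', \ell)$ (recall that sum is component-wise).
We say that $\tau \in T$ is applicable at $C \in \mathit{Conf}$ if, for all $\ell \in L$, we have $\mathit{out}_\tau(\ell) \le
C(\ell)$ (component-wise). Abusing notation, we let $\tau(C)$ denote the configuration
$C'$ defined by $C'(\ell) = C(\ell) - \mathit{out}_\tau(\ell) + \mathit{in}_\tau(\ell)$ for all $\ell \in L$. Moreover, for
$\tau(\ell, \ell') = (n_s, n_e, n_{se})$ and $\theta \in T$, we let $\tau(\ell, \ell', \theta)$ refer to $n_\theta$.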


Plays. Let $C \in \mathit{Conf}$. We write $C \models F$ if there is $\kappa \in F$ such that, for all
$\ell \in L$, we have $C(\ell) \models \kappa(\ell)$ (in the expected manner). A $C$-play, or simply play,
is a finite sequence $\pi = C_0 \tau_1 C_1 \tau_2 C_2 \ldots \tau_n C_n$ alternating between configurations
and transitions (with $n \ge 0$) such that $C_0 = C$ and, for all $i \in \{1, \ldots, n\}$,
$C_i = \tau_i(C_{i-1})$ and

- if $i$ is odd, then $\tau_i \in T_s$ and $C_i \models F$ (System's move),

- if $i$ is even, then $\tau_i \in T_e$ and $C_i \not\models F$ (Environment's move).


The set of all $C$-plays is denoted by $\mathit{Plays}_C$.
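The definitions of locations, configurations, transitions, and acceptance can be sketched in a few lines of code. This is a minimal illustration under assumptions, not the paper's construction: locations are tuples of capped letter counts, configurations and transitions are sparse dictionaries, and the acceptance test hard-codes the single condition of the game for $\varphi_4$ over $A_s = \{a\}$, $A_e = \{d\}$, $B = 3$ (no token on a location with exactly one of the two counts equal to 2). All identifiers are hypothetical.

```python
from itertools import product

A_S, A_E = ("a",), ("d",)        # system / environment letters (assumed instance)
ALPHABET = A_S + A_E
B = 3
ZERO = (0, 0, 0)                 # token counts are ordered as (s, e, se)
LOCATIONS = list(product(range(B + 1), repeat=len(ALPHABET)))


def location(word, start=(0,) * len(ALPHABET)):
    """start + word: add the letters of word, capping each count at B."""
    loc = list(start)
    for letter in word:
        i = ALPHABET.index(letter)
        loc[i] = min(loc[i] + 1, B)
    return tuple(loc)


def flows(tau):
    """Component-wise out- and in-flow of a transition, per location."""
    out_, in_ = {}, {}
    for (src, dst), k in tau.items():
        out_[src] = tuple(x + y for x, y in zip(out_.get(src, ZERO), k))
        in_[dst] = tuple(x + y for x, y in zip(in_.get(dst, ZERO), k))
    return out_, in_


def is_system_transition(tau):
    """No e-tokens move; only system letters are added along each used pair."""
    return all(k == ZERO or (k[1] == 0 and all(
        (y >= x) if letter in A_S else (y == x)
        for letter, x, y in zip(ALPHABET, src, dst)))
        for (src, dst), k in tau.items())


def applicable(tau, conf):
    out_, _ = flows(tau)
    return all(o <= c for loc, k in out_.items()
               for o, c in zip(k, conf.get(loc, ZERO)))


def apply_transition(tau, conf):
    """C'(l) = C(l) - out(l) + in(l) for every location l."""
    out_, in_ = flows(tau)
    return {loc: tuple(c - o + i for c, o, i in zip(
        conf.get(loc, ZERO), out_.get(loc, ZERO), in_.get(loc, ZERO)))
        for loc in set(conf) | set(out_) | set(in_)}


def accepting(conf):
    """Accept iff no token sits on a location with exactly one count equal to 2."""
    return all(sum(conf.get(loc, ZERO)) == 0
               for loc in LOCATIONS if (loc[0] == 2) != (loc[1] == 2))
```

For instance, starting from the configuration that puts $(6,0,9)$ on `location("")`, moving one se-token to `location("dd")` leaves the accepting set, and System can restore acceptance by moving that token on to `location("aadd")`.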


Strategies. A $C$-strategy for System is a partial mapping $f : \mathit{Plays}_C \to T_s$
such that $f(C)$ is defined and, for all $\pi = C_0 \tau_1 C_1 \ldots \tau_i C_i \in \mathit{Plays}_C$ with $\tau =
f(\pi)$ defined, we have that $\tau$ is applicable at $C_i$ and $\tau(C_i) \models F$. Play $\pi =
C_0 \tau_1 C_1 \ldots \tau_n C_n$ is

- $f$-compatible if, for all odd $i \in \{1, \ldots, n\}$, $\tau_i = f(C_0 \tau_1 C_1 \ldots \tau_{i-1} C_{i-1})$,
- $f$-maximal if it is not the strict prefix of an $f$-compatible play,
- winning if $C_n \models F$.


We say that $f$ is winning for System (from $C$) if all $f$-compatible $f$-maximal $C$-plays
are winning. Finally, $C$ is winning if there is a $C$-strategy that is winning.
Note that, given an initial configuration $C$, we deal with an acyclic finite
reachability game so that, if there is a winning $C$-strategy, then there is a positional
one, which only depends on the last configuration.

For $k \in \mathbb{N}^T$, let $C_k$ denote the configuration that maps $\ell_0$ to $k$ and all other
locations to $(0,0,0)$. We set $\mathrm{Win}(G) = \{k \in \mathbb{N}^T \mid C_k \text{ is winning for System}\}$.


Definition 10 (game problem). For sets $N_s, N_e, N_{se} \subseteq \mathbb{N}$, the game problem
is given as follows:

GAME($N_s, N_e, N_{se}$)

Input: Parameterized vector game $G$
Question: $\mathrm{Win}(G) \cap (N_s \times N_e \times N_{se}) \ne \emptyset$?


One can show that parameterized vector games are equivalent to the synthesis 
problem in the following sense: 


Lemma 11. For every sentence $\varphi \in \mathrm{FO}_A[\sim]$, there is a parameterized vector
game $G = (A, B, F)$ such that $\mathrm{Win}(\varphi) = \mathrm{Win}(G)$. Conversely, for every
parameterized vector game $G = (A, B, F)$, there is a sentence $\varphi \in \mathrm{FO}_A[\sim]$ such that
$\mathrm{Win}(G) = \mathrm{Win}(\varphi)$. Both directions are effective.


Example 12. To illustrate parameterized vector games and the reduction from
the synthesis problem, consider the formula $\varphi'_4 = \bigwedge_{\theta \in T,\, \ell \in Z} \exists^{=0} y.(\theta(y) \wedge \psi_{3,\ell}(y))$
in normal form from Example 7. For simplicity, we assume that $A_s = \{a\}$ and
$A_e = \{d\}$. That is, $Z$ is the set of vectors $\langle\!\langle a^i d^j \rangle\!\rangle \in L = \{0, \ldots, 3\}^{\{a,d\}}$ such
that $i = 2 \ne j$ or $j = 2 \ne i$. Figure 2 illustrates a couple of configurations
$C_0, \ldots, C_5 : L \to \mathbb{N}^T$. The leftmost location in a configuration is $\ell_0$, the rightmost
[Figure 2 (image not reproduced): configurations $C_0, \ldots, C_5$ connected by transitions $\tau_1, \ldots, \tau_5$, played alternately by System and Environment]

Fig. 2. A play of a parameterized vector game


location $\langle\!\langle a^3 d^3 \rangle\!\rangle$, the topmost one $\langle\!\langle a^3 \rangle\!\rangle$, and the one at the bottom $\langle\!\langle d^3 \rangle\!\rangle$.
Self-loops have been omitted, and locations from $Z$ have gray background and a
dashed border.

Towards an equivalent game $G = (A, 3, F)$, it remains to determine the acceptance
condition $F$. Recall that $\varphi_4$ says that every class contains two occurrences
of $a$ iff it contains two occurrences of $d$. This is reflected by the acceptance
condition $F = \{\kappa\}$ where $\kappa(\ell) = (=0, =0, =0)$ for all $\ell \in Z$ and $\kappa(\ell) = (\ge 0, \ge 0, \ge 0)$
for all $\ell \in L \setminus Z$. With this, a configuration is accepting iff no token is on a
location from $Z$ (a gray location).

We can verify that $\mathrm{Win}(G) = \mathrm{Win}(\varphi_4) = \mathbb{N} \times \mathbf{0} \times \mathbb{N}$. In $G$, a uniform winning
strategy $f$ for System that works for all $P$ with $P_e = \emptyset$ proceeds as follows:
System first awaits an Environment's move and then moves each token upwards
as many locations as Environment has moved it downwards. Figure 2 illustrates
an $f$-maximal $C_{(6,0,9)}$-play that is winning for System. We note that $f$ is a
"compressed" version of the winning strategy presented in Example 4, as System
makes her moves only when really needed. $\triangleleft$


5 Results for FO[~] via Parameterized Vector Games 


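In this section, we present our results for the synthesis problem for $\mathrm{FO}[\sim]$, which
we obtain by showing corresponding results for parameterized vector games. In
particular, we show that $(\mathrm{FO}[\sim], \mathbf{0}, \mathbf{0}, \mathbb{N})$ and $(\mathrm{FO}[\sim], \mathbb{N}, \mathbb{N}, \mathbf{0})$ do not have a
cutoff, whereas $(\mathrm{FO}[\sim], \mathbb{N}, \{k_e\}, \{k_{se}\})$ has a cutoff for all $k_e, k_{se} \in \mathbb{N}$. Finally, we
prove that SYNTH($\mathrm{FO}[\sim], \mathbf{0}, \mathbf{0}, \mathbb{N}$) is, in fact, undecidable.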


Lemma 13. There is a game G = (A, B, F) such that Win(G) does not have a 
cutoff wrt. (0,0,N). 


Proof. We let $A_s = \{a\}$ and $A_e = \{b\}$, as well as $B = 2$. For $k \in \{0, 1, 2\}$, define
the local acceptance conditions $=\!k \;=\; (=0, =0, =k)$ and $\ge\!k \;=\; (=0, =0, \ge k)$. Set

108 B. Bérard et al. 


Fig. 3. Acceptance conditions for a game with no cutoff wrt. (0,0, N) 


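$\ell_1 = \langle\!\langle a \rangle\!\rangle$, $\ell_2 = \langle\!\langle ab \rangle\!\rangle$, $\ell_3 = \langle\!\langle a^2 b \rangle\!\rangle$, and $\ell_4 = \langle\!\langle a^2 b^2 \rangle\!\rangle$. For $k_0, \ldots, k_4 \in \{0, 1, 2\}$ and
$\bowtie_0, \ldots, \bowtie_4 \in \{=, \ge\}$, let $[\bowtie_0 k_0, \bowtie_1 k_1, \bowtie_2 k_2, \bowtie_3 k_3, \bowtie_4 k_4]$ denote $\kappa \in \mathfrak{C}^L$ where
$\kappa(\ell_i) = (\bowtie_i k_i)$ for all $i \in \{0, \ldots, 4\}$ and $\kappa(\ell') = (=0)$ for $\ell' \notin \{\ell_0, \ldots, \ell_4\}$. Finally,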


=0.=0.=0.=0. 2 
0) [F0,=0,=0,=0, Rug 


where $K_e = \{\kappa_\ell \mid \ell \in L \text{ such that } \ell(b) > \ell(a)\}$ with $\kappa_\ell(\ell') = (\ge 1)$ if $\ell' = \ell$, and
$\kappa_\ell(\ell') = (\ge 0)$ otherwise. This is illustrated in Figure 3.

There is a winning strategy for System from any initial configuration of size
$2n$: Move two tokens from $\ell_0$ to $\ell_1$, wait until Environment sends them both to
$\ell_2$, then move them to $\ell_3$, wait until they are moved to $\ell_4$, then repeat with two
new tokens from $\ell_0$ until all the tokens are removed from $\ell_0$, and Environment
cannot escape $F$ anymore. However, one can check that there is no winning
strategy for initial configurations of odd size.


Lemma 14. There is a game G = (A, B, F) such that Win(G) does not have a 
cutoff wrt. (N,N, 0). 


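Proof. We define $G$ such that System wins only if she has at least as many
processes as Environment. Let $A_s = \{a\}$, $A_e = \{b\}$, and $B = 2$. As there are no
shared processes, we can safely ignore locations with a letter from both System
and Environment. We set $F = \{\kappa_1, \kappa_2, \kappa_3, \kappa_4\}$ where

\[ \begin{array}{lll}
\kappa_1(\langle\!\langle a \rangle\!\rangle) = (=1, =0, =0) & \kappa_2(\langle\!\langle a \rangle\!\rangle) = (=1, =0, =0) & \kappa_3(\langle\!\langle a \rangle\!\rangle) = (=0, =0, =0) \\
\kappa_1(\langle\!\langle b \rangle\!\rangle) = (=0, =0, =0) & \kappa_2(\langle\!\langle b \rangle\!\rangle) = (=0, \ge 2, =0) & \kappa_3(\langle\!\langle b \rangle\!\rangle) = (=0, \ge 1, =0),
\end{array} \]

$\kappa_4(\ell_0) = (=0, =0, =0)$, and $\kappa_i(\ell') = (\ge 0, \ge 0, =0)$ for all other $\ell' \in L$ and
$i \in \{1, 2, 3, 4\}$.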


We now turn to the case where the number of processes that can be trig- 
gered by Environment is bounded. Note that similar restrictions are imposed 
in other settings to get decidability, such as limiting the environment to a fi- 
nite (Boolean) domain [16] or restricting to one environment process [3,18]. We 
obtain decidability of the synthesis problem via a cutoff construction: 


Parameterized Synthesis for First-Order Logic over Data Words 109 


Theorem 15. Given $k_e, k_{se} \in \mathbb{N}$, every game $G = (A, B, F)$ has a cutoff wrt.
$(\mathbb{N}, \{k_e\}, \{k_{se}\})$. More precisely: Let $K$ be the largest constant that occurs in $F$.
Moreover, let $\mathit{Max} = (k_e + k_{se}) \cdot |A_e| \cdot B$ and $\hat{N} = |L|^{\mathit{Max}+1} \cdot K$. Then, $(\hat{N}, k_e, k_{se})$
is a cutoff of $\mathrm{Win}(G)$ wrt. $(\mathbb{N}, \{k_e\}, \{k_{se}\})$.
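To get a feeling for the size of this cutoff, the bound can be evaluated numerically. The sketch below assumes the reading $\mathit{Max} = (k_e + k_{se}) \cdot |A_e| \cdot B$ and $\hat{N} = |L|^{\mathit{Max}+1} \cdot K$ with $|L| = (B+1)^{|A|}$; the function name and parameter names are hypothetical.

```python
def cutoff(alphabet_size, env_alphabet_size, B, K, k_e, k_se):
    """Evaluate N-hat = |L|^(Max+1) * K under the assumed reading of Theorem 15."""
    num_locations = (B + 1) ** alphabet_size               # |L| = |{0..B}^A|
    max_env_moves = (k_e + k_se) * env_alphabet_size * B   # Max
    return num_locations ** (max_env_moves + 1) * K


# One system letter, one environment letter, B = 2, K = 1,
# one environment process and one mixed process.
print(cutoff(alphabet_size=2, env_alphabet_size=1, B=2, K=1, k_e=1, k_se=1))
```

Even for this tiny instance the value is in the tens of thousands, which is consistent with the doubly exponential complexity discussed after Corollary 16.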
Proof. We will show that, for all $N \ge \hat{N}$,
\[ (N, k_e, k_{se}) \in \mathrm{Win}(G) \iff (N + 1, k_e, k_{se}) \in \mathrm{Win}(G). \]

The main observation is that, when $C$ contains more than $K$ tokens in a given
$\ell \in L$, adding more tokens in $\ell$ will not change whether $C \models F$. Given $C, C' \in
\mathit{Conf}$, we write $C <_e C'$ if $C \ne C'$ and there is $\tau \in T_e$ such that $\tau(C) = C'$. Note
that the length $d$ of a chain $C_0 <_e C_1 <_e \ldots <_e C_d$ is bounded by $\mathit{Max}$. In other
words, $\mathit{Max}$ is the maximal number of transitions that Environment can do in a
play. For all $d \in \{0, \ldots, \mathit{Max}\}$, let $\mathit{Conf}_d$ be the set of configurations $C \in \mathit{Conf}$
such that the longest chain in $(\mathit{Conf}, <_e)$ starting from $C$ has length $d$.


Claim. Suppose that $C \in \mathit{Conf}_d$ and $\ell \in L$ such that $C(\ell) = (N, n_e, n_{se})$ with
$N \ge |L|^{d+1} \cdot K$ and $n_e, n_{se} \in \mathbb{N}$. Set $D = C[\ell \mapsto (N + 1, n_e, n_{se})]$. Then,

\[ C \text{ is winning for System} \iff D \text{ is winning for System.} \]


To show the claim, we proceed by induction on d € N, which is illustrated in 
Figure 4. In each implication, we distinguish the cases d = 0 and d > 1. For the 
latter, we assume that equivalence holds for all values strictly smaller than d. 

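For $\tau \in T_s$ and $\ell, \ell' \in L$, we let $\tau[(\ell, \ell', s){+}{+}]$ denote the transition $\eta \in T_s$
given by $\eta(\ell_1, \ell_2, e) = \tau(\ell_1, \ell_2, e) = 0$, $\eta(\ell_1, \ell_2, se) = \tau(\ell_1, \ell_2, se)$, $\eta(\ell_1, \ell_2, s) =
\tau(\ell_1, \ell_2, s) + 1$ if $(\ell_1, \ell_2) = (\ell, \ell')$, and $\eta(\ell_1, \ell_2, s) = \tau(\ell_1, \ell_2, s)$ if $(\ell_1, \ell_2) \ne (\ell, \ell')$.
We define $\tau[(\ell, \ell', s){-}{-}]$ similarly (provided $\tau(\ell, \ell', s) \ge 1$).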


$\Longrightarrow$: Let $f$ be a winning strategy for System from $C \in \mathit{Conf}_d$. Let $\tau' = f(C)$
and $C' = \tau'(C)$. Note that $C' \models F$. Since $C(\ell, s) = N \ge |L|^{d+1} \cdot K$, there is
$\ell' \in L$ such that $\ell + w = \ell'$ for some $w \in A_s^*$ and $C'(\ell', s) = N' \ge |L|^d \cdot K$.

We show that $D = C[\ell \mapsto (N + 1, n_e, n_{se})]$ is winning for System by exhibiting
a corresponding winning strategy $g$ from $D$ that will carefully control the position
of the additional token. First, set $g(D) = \eta'$ where $\eta' = \tau'[(\ell, \ell', s){+}{+}]$. Let $D' =
\eta'(D)$. We obtain $D'(\ell', s) = N' + 1$. Note that, since $N' \ge K$, the acceptance
condition $F$ cannot distinguish between $C'$ and $D'$. Thus, we have $D' \models F$.


Case $d = 0$: As, for all transitions $\eta'' \in T_e$, we have $\eta''(D') = D' \models F$, we
reached a maximal play that is winning for System. We deduce that $D$ is
winning for System.


Case $d \ge 1$: Take any $\eta'' \in T_e$ and $D''$ such that $D'' = \eta''(D') \not\models F$. Let $\tau'' = \eta''$
and $C'' = \tau''(C')$. Note that $D'' = C''[(\ell', s) \mapsto N' + 1]$, $C'' = D''[(\ell', s) \mapsto
N']$, and $C'', D'' \in \mathit{Conf}_{d^-}$ for some $d^- < d$. As $f$ is a winning strategy
for System from $C$, we have that $C''$ is winning for System. By induction
hypothesis, $D''$ is winning for System, say by winning strategy $g''$. We let
$g(D \eta' D' \eta'' \pi) = g''(\pi)$ for all $D''$-plays $\pi$. For all unspecified plays, let $g$
return any applicable system transition. Altogether, for any choice of $\eta''$, we
have that $g''$ is winning from $D''$. Thus, $g$ is a winning strategy from $D$.
Fig. 4. Induction step in the cutoff construction


$\Longleftarrow$: Suppose $g$ is a winning strategy for System from $D$. Thus, for $\eta' = g(D)$
and $D' = \eta'(D)$, we have $D' \models F$. Recall that $D(\ell, s) \ge (|L|^{d+1} \cdot K) + 1$. We
distinguish two cases:


1. Suppose there is $\ell' \in L$ such that $\ell \ne \ell'$, $D'(\ell', s) = N' + 1$ for some
$N' \ge |L|^d \cdot K$, and $\eta'(\ell, \ell', s) \ge 1$. Then, we set $\tau' = \eta'[(\ell, \ell', s){-}{-}]$.

2. Otherwise, we have $D'(\ell, s) \ge (|L|^d \cdot K) + 1$, and we set $\tau' = \eta'$ (as well as
$\ell' = \ell$ and $N' = N$).


Let $C' = \tau'(C)$. Since $D' \models F$, one obtains $C' \models F$.


Case $d = 0$: For all transitions $\tau'' \in T_e$, we have $\tau''(C') = C' \models F$. Thus, we
reached a maximal play that is winning for System. We deduce that $C$ is
winning for System.


Case $d \ge 1$: Take any $\tau'' \in T_e$ such that $C'' = \tau''(C') \not\models F$. Let $\eta'' = \tau''$ and
$D'' = \eta''(D')$. We have $C'' = D''[(\ell', s) \mapsto N']$, $D'' = C''[(\ell', s) \mapsto N' + 1]$,
and $C'', D'' \in \mathit{Conf}_{d^-}$ for some $d^- < d$. As $D''$ is winning for System, by
induction hypothesis, $C''$ is winning for System, say by winning strategy $f''$.
We let $f(C \tau' C' \tau'' \pi) = f''(\pi)$ for all $C''$-plays $\pi$. For all unspecified plays,
let $f$ return an arbitrary applicable system transition. Again, for any choice
of $\tau''$, $f''$ is winning from $C''$. Thus, $f$ is a winning strategy from $C$.


This concludes the proof of the claim and, therefore, of Theorem 15. 


Corollary 16. Let $k_e, k_{se} \in \mathbb{N}$ be the number of environment and the number
of mixed processes, respectively. The problems GAME($\mathbb{N}, \{k_e\}, \{k_{se}\}$) and
SYNTH($\mathrm{FO}[\sim], \mathbb{N}, \{k_e\}, \{k_{se}\}$) are decidable.

In particular, by Theorem 15, the game problem can be reduced to an exponential
number of acyclic finite-state games whose size (and hence the time
complexity for determining the winner) is exponential in the cutoff and, therefore,
doubly exponential in the size of the alphabet, the bound $B$, and the fixed
number of processes that are controllable by the environment.


Theorem 17. GAME($\mathbf{0}, \mathbf{0}, \mathbb{N}$) and SYNTH($\mathrm{FO}[\sim], \mathbf{0}, \mathbf{0}, \mathbb{N}$) are undecidable.


Proof. We provide a reduction from the halting problem for 2-counter machines
(2CM) to GAME($\mathbf{0}, \mathbf{0}, \mathbb{N}$). A 2CM $M = (Q, \Delta, c_1, c_2, q_0, q_h)$ has two counters,
$c_1$ and $c_2$, a finite set of states $Q$, and a set of transitions $\Delta \subseteq Q \times \mathit{Op} \times Q$
where $\mathit{Op} = \{c_i{+}{+},\; c_i{-}{-},\; c_i{=}{=}0 \mid i \in \{1, 2\}\}$. Moreover, we have an initial
state $q_0 \in Q$ and a halting state $q_h \in Q$. A configuration of $M$ is a triple
$\gamma = (q, \nu_1, \nu_2) \in Q \times \mathbb{N} \times \mathbb{N}$ giving the current state and the current respective
counter values. The initial configuration is $\gamma_0 = (q_0, 0, 0)$ and the set of halting
configurations is $F = \{q_h\} \times \mathbb{N} \times \mathbb{N}$. For $t \in \Delta$, configuration $(q', \nu'_1, \nu'_2)$ is a
($t$-)successor of $(q, \nu_1, \nu_2)$, written $(q, \nu_1, \nu_2) \vdash_t (q', \nu'_1, \nu'_2)$, if there is $i \in \{1, 2\}$
such that $\nu'_{3-i} = \nu_{3-i}$ and one of the following holds: (i) $t = (q, c_i{+}{+}, q')$ and
$\nu'_i = \nu_i + 1$, or (ii) $t = (q, c_i{-}{-}, q')$ and $\nu'_i = \nu_i - 1$, or (iii) $t = (q, c_i{=}{=}0, q')$ and
$\nu'_i = \nu_i = 0$. A run of $M$ is a (finite or infinite) sequence $\gamma_0 \vdash_{t_1} \gamma_1 \vdash_{t_2} \ldots$. The
2CM halting problem asks whether there is a run reaching a configuration in $F$.
It is known to be undecidable [34].
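The successor relation of a 2CM can be written down directly. The sketch below uses a hypothetical encoding: a transition is a triple (source, (op, i), target) with op one of "inc", "dec", "zero". It also includes a bounded search for a halting run; since the halting problem is undecidable, such a search can only confirm halting within a given number of steps, never refute it in general.

```python
def successor(config, t):
    """Return the t-successor of config = (q, v1, v2), or None if t is not enabled."""
    q, counters = config[0], [config[1], config[2]]
    source, (op, i), target = t
    if source != q:
        return None
    if op == "inc":
        counters[i - 1] += 1
    elif op == "dec":
        if counters[i - 1] == 0:      # counters range over the naturals
            return None
        counters[i - 1] -= 1
    elif op == "zero":
        if counters[i - 1] != 0:      # zero test succeeds only on value 0
            return None
    else:
        raise ValueError(f"unknown operation {op!r}")
    return (target, counters[0], counters[1])


def reaches_halt(delta, q0, qh, max_steps):
    """Bounded exploration: is a halting configuration reachable in <= max_steps steps?"""
    frontier = {(q0, 0, 0)}
    for _ in range(max_steps + 1):
        if any(c[0] == qh for c in frontier):
            return True
        next_frontier = set()
        for c in frontier:
            for t in delta:
                s = successor(c, t)
                if s is not None:
                    next_frontier.add(s)
        frontier = next_frontier
    return False


# A toy machine: increment c1 twice, count it back down, then halt on zero.
delta = [("q0", ("inc", 1), "q1"), ("q1", ("inc", 1), "q2"),
         ("q2", ("dec", 1), "q2"), ("q2", ("zero", 1), "qh")]
print(reaches_halt(delta, "q0", "qh", max_steps=5))
```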


We fix a 2CM $M = (Q, \Delta, c_1, c_2, q_0, q_h)$. Let $A_s = Q \cup \Delta \cup \{a_1, a_2\}$ and $A_e =
\{b\}$ with $a_1$, $a_2$, and $b$ three fresh symbols. We consider the game $G = (A, B, F)$
with $A = A_s \uplus A_e$, $B = 4$, and $F$ defined below. Let $L = \{0, \ldots, B\}^A$. Since there
are only processes shared by System and Environment, we alleviate notation and
consider that a configuration is simply a mapping $C : L \to \mathbb{N}$. From now on, to
avoid confusion, we refer to configurations of the 2CM $M$ as $M$-configurations,
and to configurations of $G$ as $G$-configurations.

Intuitively, every valid run of M will be encoded as a play in G, and the 
acceptance condition will enforce that, if a player in G deviates from a valid 
play, then she will lose immediately. At any point in the play, there will be at 
most one process with only a letter from Q played, which will represent the 
current state in the simulated 2CM run. Similarly, there will be at most one 
process with only a letter from A to represent what transition will be taken 
next. Finally, the value of counter $c_i$ will be encoded by the number of processes
with exactly two occurrences of $a_i$ and two occurrences of $b$ (i.e., $C(\langle\!\langle a_i^2 b^2 \rangle\!\rangle)$).

To increase counter $c_i$, the players will move a new token to $\langle\!\langle a_i^2 b^2 \rangle\!\rangle$, and to
decrease it, they will move, together, a token from $\langle\!\langle a_i^2 b^2 \rangle\!\rangle$ to $\langle\!\langle a_i^4 b^4 \rangle\!\rangle$. Observe
that, if $c_i$ has value 0, then $C(\langle\!\langle a_i^2 b^2 \rangle\!\rangle) = 0$ in the corresponding configuration
of the game. As expected, it is then impossible to simulate the decrement of
$c_i$. Environment's only role is to acknowledge System's actions by playing its
(only) letter when System simulates a valid run. If System tries to cheat, she 
loses immediately. 


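Encoding an M-configuration. Let us be more formal. Suppose $\gamma = (q, \nu_1, \nu_2)$ is
an $M$-configuration and $C$ a $G$-configuration. We say that $C$ encodes $\gamma$ if

- $C(\langle\!\langle q \rangle\!\rangle) = 1$, $C(\langle\!\langle a_1^2 b^2 \rangle\!\rangle) = \nu_1$, $C(\langle\!\langle a_2^2 b^2 \rangle\!\rangle) = \nu_2$,
- $C(\ell) \ge 0$ for all $\ell \in \{\ell_0\} \cup \{\langle\!\langle \hat{q}^2 b^2 \rangle\!\rangle, \langle\!\langle t^2 b^2 \rangle\!\rangle, \langle\!\langle a_i^4 b^4 \rangle\!\rangle \mid \hat{q} \in Q, t \in \Delta, i \in \{1, 2\}\}$,
- $C(\ell) = 0$ for all other $\ell \in L$.


We then write $\gamma = m(C)$. Let $\mathcal{C}(\gamma)$ be the set of $G$-configurations $C$ that
encode $\gamma$. We say that a $G$-configuration $C$ is valid if $C \in \mathcal{C}(\gamma)$ for some $\gamma$.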


Simulating a transition of M. Let us explain how we go from a G-configuration 
encoding y to a G-configuration encoding a successor M-configuration y’. Ob- 
serve that System cannot change by herself the M-configuration encoded. If, for 
instance, she tries to change the current state q, she might move one process from 
Llo to (q’)), but then the G-configuration is not valid anymore. We need to move 
the process in {q} into (q?b?)) and this requires the cooperation of Environment. 

Assume that the game is in configuration C encoding y = (q,11, V2). System 
will pick a transition t starting in state q, say, t = (g,ci1++,q'). From con- 
figuration C, System will go to the configuration Cı defined by C)((t))) = 1, 
C;((a1))) = 1, and C1 (2) = C(£) for all other £ € L. 

If the transition t is correctly chosen, Environment will go to a configura- 
tion Cy defined by C2((q)) = 0, Co((gb)) = 1, Co((t)) = 0, Co((eb)) = 1, 
C2((a1))) = 0, Co((aib))) = 1 and, for all other @ € L, Co(¢) = Cı(4. This 
means that Environment moves processes in locations (t)), (q)), (a1) to loca- 
tions (tb), (qb), (a1b)), respectively. 

To finish the transition, System will now move a process to the destination 
state q’ of t, and go to configuration C3 defined by C3((q’})) = 1, C3((tb))) = 0, 
C3(((t?b)) = 1, C3((qb)) = 0, C3((q7b)) = 1, C3((a1b)) = 0, Ca(Kazd)) = 1, 
and C3(¢) = C2(£) for all other £ € L. 

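Finally, Environment moves to configuration $C_4$ given by $C_4(\langle t^2b\rangle) = 0$,
$C_4(\langle t^2b^2\rangle) = C_3(\langle t^2b^2\rangle) + 1$, $C_4(\langle q^2b\rangle) = 0$, $C_4(\langle q^2b^2\rangle) = C_3(\langle q^2b^2\rangle) + 1$,
$C_4(\langle a_1^2b\rangle) = 0$, $C_4(\langle a_1^2b^2\rangle) = C_3(\langle a_1^2b^2\rangle) + 1$, and $C_4(\ell) = C_3(\ell)$ for all other
$\ell \in L$. Observe that $C_4 \in \mathcal{C}((q', \nu_1 + 1, \nu_2))$.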

Other types of transitions will be simulated similarly. To force System to
start the simulation in $\gamma_0$, and not in any M-configuration, the configurations
$C$ such that $C(\langle q_0^2b^2\rangle) = 0$ and $C(\langle q\rangle) = 1$ for $q \neq q_0$ are not valid, and will be
losing for System.


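Acceptance condition. It remains to define F in a way that enforces the above
sequence of G-configurations. Let $L_{\checkmark} = \{\ell_0\} \cup \{\langle a_i^2b^2\rangle, \langle a_i^3b^3\rangle \mid i \in \{1,2\}\} \cup
\{\langle q^2b^2\rangle \mid q \in Q\} \cup \{\langle t^2b^2\rangle \mid t \in \Delta\}$ be the set of elements in $L$ whose values do
not affect the acceptance of the configuration. By $[\ell_1 \bowtie_1 n_1, \ldots, \ell_k \bowtie_k n_k]$, we
denote $\kappa \in \mathfrak{C}$ such that $\kappa(\ell_i) = (\bowtie_i n_i)$ for $i \in \{1,\ldots,k\}$ and $\kappa(\ell) = (= 0)$ for all
$\ell \in L \setminus \{\ell_1,\ldots,\ell_k\}$. Moreover, for a set of locations $\hat{L} \subseteq L$, we let $\hat{L} \geq 0$ stand
for “$(\ell \geq 0)$ for all $\ell \in \hat{L}$”.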

First, we force Environment to play only in response to System by making 
System win as soon as there is a process where Environment has played more 
letters than System (see Condition (d) in Table 2). 

If $\gamma$ is not halting, the configurations in $\mathcal{C}(\gamma)$ will not be winning for System.
Hence, System will have to move to win (Condition (a)).




Table 2. Acceptance conditions for the game simulating a 2CM 


Requirements for System 


(a) For all t = (q, op, q’) € Q: 


Fan = Ugea{la) = 1, Kt) =1, (a) =1, (PP), (Lr \ P) > OI} if op = cit 
Fan = Uzeo {ila} = 1, Kt) = 1, (Ft?) = 1, (PP) > 1, (Lr \ ((@2*)}) > OI} if op =e. 
Feat) = Uzeo {la = 1, Kt) = 1, (a0?) = 0, (470?) > 1, (Lv \ EE), (a26?)}) > O]} if op = ci==0 


(b) For all t = (qo, op, g’) € Q such that op € {ci++, ci==0}: 
Fi = {[(g0) = 1, (t) = 1, (ai) = 1, fo > 0]} if op = ci++ 
Fi = {[(go) = 1, (t) = 1, £0 > O}} if op = ci== 


(c) For all t = (q,0p,q') € Q: 
Fasan = {b} = 1, (Pb) = 1, (čb) = 1, (a) = 1, Ly 2 0} if op = ci 
Fanan = {b} = 1, (1b) = 1, (att?) = 1, (q') = 1, Ly 2 0} if op = ci 
Fasan = {I0} = 1, (20) = 1, Ly > 0)} P 


Requirements for Environment 


(d) Let Ls<e = {LE L | (Zaca, (a) < €(b)}. For all £ € Lsce: Fe = [E > 1,(L\ {G) > 0] 


(e) For all t = (q,0p,q') € Q: 


[(qb) = 1, (t) = 1, (ai) =1, Ly =O], Ka) =1, (tb) =1, (ai) =1, Lv =O), 
Flat Ka) =1, Kt) =1, (asd) = 1, Ly > 0], [(qb) = 1, (tb) =1, (ai) =1, Ly 2 0], if op = c++ 
Kab) = 1, (t) = 1, (aid) = 1, Ly > 0], Ka) =1, (tb) =1, (aid) = 1, Lv > 0] 
Kab) = 1, (t) = 1, (aib?) =1, Ly > 0), Ka) =1, (tb) =1, (azb?) = 1, Ly > 0), 
Fé.) Ka) =1, (t) =1, (aib?) =1, Ly = 0], [(qb) = 1, (tb) = 1, (ab?) = 1, Ly > 0], $ if op = ci- 
Kab) = 1, (t) = 1, (až?) = 1, Ly > 0], Ka) =1, (tb) =1, Qab’) = 1, Ly > 0) 
F&a = {) = 1, (t) = 1, Ly > 0], Ka) = 1, (tb) = 1, Ly > Oj} if op = ci== 
(£) For all t = (q,op,q') € Q: 
(7) =1, (ab) = 1, (tD) > 0, (azb) > 0, Ly > O] 
Fi 7 (7) =1, (qb) = 0, (tb) = 1, (azb) > 0, Ly > 0] eres 
(aha) V (dq) =1, Kab) = 0, (t7b) > 0, (azb) = 1, Ly > 0] = 
(qb) = 1, (q°b) > 0, (t7b) > 0, (azd) > 0, Ly > 0] 
ld) =1, ab) = 1, Kb) > 0, (aib?) > 0, Ly > 0), 
a 7 ld) =1, (qb) > 0, (tb) = 1, Qab?) > 0, Ly > 0), toca 
(ata) E ) Qa) =1, Qab) > 0, (4b) > 0, (att?) = 1, Ly 20) f 7 PTS 
(q'b) = 1, (q°b) > 0, (tb) > 0, Qao?) > 0, Ly > 0] 
(7) =1, (qb) = 1, (tb) > 0, Ly = 0) 
FE = 4G) =1, (ab) > 0, Ko) =1, Ly > 0), iPopaie se 
'b) = 1, 4g? (Pb) > 0, (ab?) > 0, Ly > 0] 
The first transition chosen by System must start from the initial state of M.
This is enforced by Condition (b).

Once System has moved, Environment will move other processes to leave
accepting configurations. The only possible move for her is to add $b$ on a process
in locations $\langle q\rangle$, $\langle t\rangle$, and $\langle a_i\rangle$ if $t$ is a transition incrementing counter
$c_i$ (respectively $\langle a_i^3b^2\rangle$ if $t$ is a transition decrementing counter $c_i$). All other
G-configurations accessible by Environment from already defined accepting
configurations are winning for System, as established in Condition (e).

System can now encode the successor configuration of M, according to the 
chosen transition, by moving a process to the destination state of the transition 
(see Condition (c)). 

Finally, Environment makes the necessary transitions for the configuration 
to be a valid G-configuration. If she deviates, System wins (see Condition (f)). 

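If Environment reaches a configuration in $\mathcal{C}(\gamma)$ for $\gamma \in F$, System can win by
moving the process in $\langle q_h\rangle$ to $\langle q_h^2\rangle$. From there, all the configurations reachable
by Environment are also winning for System:


$F_F = \{[\langle q_h^2\rangle = 1, L_{\checkmark} \geq 0], [\langle q_h^2b\rangle = 1, L_{\checkmark} \geq 0], [\langle q_h^2b^2\rangle = 1, L_{\checkmark} \geq 0]\}$.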


Finally, the acceptance condition is given by 


F= |J AU U Foul) Cet Fi U Futa) UFetg)UFF. 
LE Lesce t=(q0,0p,q')EA t=(q,0p,q')EA 


Note that a correct play can end in three different ways: either there is a
process in $\langle q_h\rangle$ and System moves it to $\langle q_h^2\rangle$, or System has no transition to
pick, or there are not enough processes in $\ell_0$ for System to simulate a new
transition. Only the first kind is winning for System.

We can show that there is an accepting run in M iff there is some k such 
that System has a winning C(9.9,x)-strategy for G. 


6 Conclusion 


There are several questions that we left open and that are interesting in their own 
right due to their fundamental character. Moreover, in the decidable cases, it will 
be worthwhile to provide tight bounds on cutoffs and the algorithmic complexity 
of the decision problem. Like in [7, 15, 16,30,31], our strategies allow the system 
to have a global view of the whole program run executed so far. However, it is 
also perfectly natural to consider uniform local strategies where each process only 
sees its own actions and possibly those that are revealed according to some causal 
dependencies. This is, e.g., the setting considered in [3, 18] for a fixed number of 
processes and in [25] for parameterized systems over ring architectures. 

Moreover, we would like to study a parameterized version of the control 
problem [35] where, in addition to a specification, a program in terms of an arena 
is already given but has to be controlled in a way such that the specification is 
satisfied. Finally, our synthesis results crucially rely on the fact that the number 
of processes in each execution is finite. It would be interesting to consider the 
case with potentially infinitely many processes. 
Abstract. Bertrand et al. introduced a model of parameterised systems,
where each agent is represented by a finite state system, and studied the 
following control problem: for any number of agents, does there exist a 
controller able to bring all agents to a target state? They showed that 
the problem is decidable and EXPTIME-complete in the adversarial 
setting, and posed as an open problem the stochastic setting, where the 
agent is represented by a Markov decision process. In this paper, we show 
that the stochastic control problem is decidable. Our solution makes
significant use of well quasi orders, of the max-flow min-cut theorem,
and of the theory of regular cost functions.


1 Introduction 


The control problem for populations of identical agents. The model we study
was introduced in [3] (see also the journal version [4]): a population of agents
is controlled uniformly, meaning that the controller applies the same action
to every agent. The agents are represented by a finite state system, the same
for every agent. The key difficulty is that there is an arbitrarily large number of
agents: the control problem is whether for every $n \in \mathbb{N}$, there exists a controller
able to bring all $n$ agents synchronously to a target state.

The technical contribution of [3,4] is to prove that in the adversarial setting 
where an opponent chooses the evolution of the agents, the (adversarial) control 
problem is EXPTIME-complete. 

In this paper, we study the stochastic setting, where each agent evolves
independently according to a probability distribution, i.e. the finite state system
modelling an agent is a Markov decision process. The control problem becomes
whether for every $n \in \mathbb{N}$, there exists a controller able to bring all $n$ agents
synchronously to a target state with probability one.
ANR project (ANR-16-CE40-0007).

Our main technical result is that the stochastic control problem is decidable.
In the next paragraphs we discuss four motivations for studying this problem: 
control of biological systems, parameterised verification and control, distributed 
computing, and automata theory. 


Modelling biological systems. The original motivation for studying this model
was controlling populations of yeast [21]. In this application, the concentration
of some molecule is monitored through its fluorescence level. Controlling the
frequency and duration of injections of a sorbitol solution influences the
concentration of the target molecule, triggering different chemical reactions which can
be modelled by a finite state system. The objective is to control the population
to reach a predetermined fluorescence state. As discussed in the conclusions
of [3,4], the stochastic semantics is more satisfactory than the adversarial one for
representing the behaviours of the chemical reactions, so our decidability result
is a step towards a better understanding of the modelling of biological systems
as populations of arbitrarily many agents represented by finite state systems.


From parameterised verification to parameterised control. Parameterised
verification was introduced in [12]: it is the verification of a system composed of an
arbitrary number of identical components. The control problem we study here,
introduced in [3,4], is the first step towards parameterised control: the goal
is to control a system composed of many identical components in order to ensure a
given property. To the best of our knowledge, the contributions of [3,4] are the
first results on parameterised control; by extension, we present the first results
on parameterised control in a stochastic setting.


Distributed computing. Our model resembles two models introduced for the
study of distributed computing. The first and most widely studied is population
protocols, introduced in [2]: the agents are modelled by finite state systems
and interact by pairs drawn at random. The mode of interaction is the key
difference with the model we study here: in a time step, all of our agents perform
simultaneously and independently the same action. This brings us closer
to broadcast protocols as studied for instance in [8], in which one action involves
an arbitrary number of agents. As explained in [3,4], our model can be seen as
a subclass of (stochastic) broadcast protocols, but key differences exist in the
semantics, making the two bodies of work technically independent.

The focus of the distributed computing community when studying population 
or broadcast protocols is to construct the most efficient protocols for a given 
task, such as (prominently) electing a leader. A growing literature from the 
verification community focusses on checking the correctness of a given protocol 
against a given specification; we refer to the recent survey [7] for an overview. 
We concentrate on the control problem, which can then be seen as a first result 
in the control of distributed systems in a stochastic setting. 


Alternative semantics for probabilistic automata. It is very tempting to consider
the limit case of infinitely many agents: the parameterised control question
becomes the value 1 problem for probabilistic automata, which was proved
undecidable in [13], even in very restricted cases [10]. Hence abstracting
continuous distributions by a discrete population of arbitrary size can be seen
as an approximation technique for probabilistic automata. Using $n$ agents
corresponds to using numerical approximation up to $2^{-n}$ with random rounding;
in this sense the control problem considers arbitrarily fine approximations. The
plague of undecidability results on probabilistic automata (see e.g. [9]) is nicely
contrasted by our positive result, which is one of the few decidability results
on probabilistic automata not making structural assumptions on the underlying
graph.


Our results. We prove decidability of the stochastic control problem. The first
insight is given by the theory of well quasi orders, which motivates the introduction
of a new problem called the sequential flow problem. The first step of our
solution is to reduce the stochastic control problem to (many instances of) the
sequential flow problem. The second insight comes from the theory of regular
cost functions, providing us with a set of tools for addressing the key difficulty
of the problem, namely the fact that there are arbitrarily many agents. Our key
technical contribution is to show the computability of the sequential flow problem
by reducing it to a boundedness question expressed in cost monadic
second order logic using the max-flow min-cut theorem.


Related work. The notion of decisive Markov chains was introduced in [1] as
a unifying property for studying infinite-state Markov chains with finite-like
properties. A typical example of decisive Markov chains is lossy channel systems,
where tokens can be lost anytime, inducing monotonicity properties. Our
situation is the exact opposite as we are considering (using the Petri nets
terminology) safe Petri nets where the number of tokens along a run is constant.
So it is not clear whether the underlying argument in both cases can be unified
using decisiveness.


Organisation of the paper. We define the stochastic control problem in Section 2, 
and the sequential flow problem in Section 3. We construct a reduction from the 
former to (many instances of) the latter in Section 4, and show the decidability 
of the sequential flow problem in Section 5. 


2 The stochastic control problem 


Definition 1. A Markov decision process (MDP for short) consists of 


— a finite set of states $Q$,
— a finite set of actions $A$,
— a stochastic transition table $\rho : Q \times A \to \mathcal{D}(Q)$.


The interpretation of the transition table is that from the state $p$ under action
$a$, the probability to transition to $q$ is $\rho(p,a)(q)$. The transition relation $\Delta$ is
defined by
$\Delta = \{(p,a,q) \in Q \times A \times Q : \rho(p,a)(q) > 0\}$.


We also use $\Delta_a$, given by $\{(p,q) \in Q \times Q : (p,a,q) \in \Delta\}$.
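
As a small illustration of these definitions, the following Python sketch derives the transition relation and its per-action slices from a transition table. The MDP used here (states s, q, t, actions a, b, and all probabilities) is a made-up example and is not taken from the paper; the names `RHO`, `transition_relation` and `delta_of` are likewise hypothetical.

```python
from fractions import Fraction

# Hypothetical transition table rho: (state, action) -> distribution over states.
# A missing (state, action) pair means the action is not available in that state.
RHO = {
    ("s", "a"): {"s": Fraction(1, 2), "q": Fraction(1, 2)},
    ("q", "a"): {"q": Fraction(1)},
    ("q", "b"): {"t": Fraction(1, 3), "s": Fraction(2, 3)},
    ("t", "a"): {"t": Fraction(1)},
    ("t", "b"): {"t": Fraction(1)},
}


def transition_relation(rho):
    """Delta = {(p, a, q) : rho(p, a)(q) > 0}."""
    return {
        (p, a, q)
        for (p, a), dist in rho.items()
        for q, prob in dist.items()
        if prob > 0
    }


def delta_of(rho, action):
    """Delta_a = {(p, q) : (p, a, q) in Delta} for the given action a."""
    return {(p, q) for (p, a, q) in transition_relation(rho) if a == action}
```

Only the support of each distribution survives in the relation, which is all that the rest of the development uses.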

We refer to [17] for the usual notions related to MDPs; it turns out that very
little probability theory will be needed in this paper, so we restrict ourselves to
mentioning only the relevant objects. In an MDP M, a strategy is a function
$\sigma : Q \to A$; note that we consider only pure and positional strategies, as they
will be sufficient for our purposes.

Given a source $s \in Q$ and a target $t \in Q$, we say that the strategy $\sigma$ almost
surely reaches $t$ if the probability that a path starting from $s$ and consistent
with $\sigma$ eventually leads to $t$ is 1. As we shall recall in Section 4, whether there
exists a strategy ensuring to reach $t$ almost surely from $s$, called the almost
sure reachability problem for MDPs, can be reduced to solving a two player Büchi
game, and in particular does not depend upon the exact probabilities. In other
words, the only relevant information for each $(p,a,q) \in Q \times A \times Q$ is whether
$\rho(p,a)(q) > 0$ or not. Since the same will be true for the stochastic control
problem we study in this paper, in our examples we do not specify the exact
probabilities, and an edge from $p$ to $q$ labelled $a$ means that $\rho(p,a)(q) > 0$.
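
To make the independence from exact probabilities concrete, here is a minimal Python sketch that decides almost sure reachability from the support of the transition table alone. It uses the standard nested fixpoint on the transition graph rather than the Büchi game reduction recalled in Section 4, and the example MDP and all names (`almost_sure_states`, `SUCC`, `RISKY`, the sink `bot`) are assumptions made for illustration.

```python
def almost_sure_states(states, actions, succ, target):
    """Set of states from which some strategy reaches `target` with probability one.

    succ[(p, a)] is the support {q : rho(p, a)(q) > 0}; a missing entry means
    that action a is unavailable in p. No probability value is ever consulted.
    """
    win = set(states)
    while True:
        # States of `win` that can reach the target with positive probability
        # using only actions whose whole support stays inside `win`.
        reach = {target} & win
        grew = True
        while grew:
            grew = False
            for p in win - reach:
                for a in actions:
                    nxt = succ.get((p, a), set())
                    if nxt and nxt.issubset(win) and nxt & reach:
                        reach.add(p)
                        grew = True
                        break
        if reach == win:
            return win
        win = reach


# Hypothetical MDP: from q, action b either reaches t or falls back to s.
STATES = {"s", "q", "t", "bot"}
ACTIONS = ["a", "b"]
SUCC = {
    ("s", "a"): {"s", "q"}, ("s", "b"): {"bot"},
    ("q", "a"): {"q"}, ("q", "b"): {"t", "s"},
    ("t", "a"): {"t"}, ("t", "b"): {"t"},
    ("bot", "a"): {"bot"}, ("bot", "b"): {"bot"},
}
# Variant where action b from q may fall into the sink instead of going back to s.
RISKY = dict(SUCC)
RISKY[("q", "b")] = {"t", "bot"}
```

In the first MDP the source s is winning; in the variant only t itself is, because every attempt to reach t risks the sink.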

Let us now fix an MDP M and consider a population of $n$ tokens (we use
tokens to represent the agents). Each token evolves in an independent copy of
the MDP M. The controller acts through a strategy $\sigma : Q^n \to A$, meaning
that given the state each of the $n$ tokens is in, the controller chooses one action
to be performed by all tokens independently. Formally, we are considering the
product MDP $M^n$ whose set of states is $Q^n$, set of actions is $A$, and transition
table is $\rho^n(u,a)(v) = \prod_{i=1}^{n} \rho(u_i,a)(v_i)$, where $u,v \in Q^n$ and $u_i, v_i$ are the $i$-th
components of $u$ and $v$.
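
The product transition table can be computed directly from its definition. The sketch below does so with exact rationals; the table `RHO` is a made-up example and `product_probability` is a hypothetical helper name.

```python
from fractions import Fraction
from math import prod

# Hypothetical single-token transition table: (state, action) -> distribution.
RHO = {
    ("s", "a"): {"s": Fraction(1, 2), "q": Fraction(1, 2)},
    ("q", "a"): {"q": Fraction(1)},
}


def product_probability(rho, u, a, v):
    """rho^n(u, a)(v): product over i of rho(u_i, a)(v_i), for tuples u, v of length n."""
    factors = (
        rho.get((ui, a), {}).get(vi, Fraction(0))
        for ui, vi in zip(u, v)
    )
    return prod(factors, start=Fraction(1))
```

For two tokens in s under action a, each of the four outcomes in {s, q} x {s, q} has probability 1/4, which reflects that the tokens move independently.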

Let $s,t \in Q$ be the source and target states. We write $s^n$ and $t^n$ for the
constant $n$-tuples where all components are $s$ and $t$. For a fixed value of $n$,
whether there exists a strategy ensuring to reach $t^n$ almost surely from $s^n$ can
be reduced to solving a two player Büchi game in the same way as above for a
single MDP, replacing M by $M^n$. The stochastic control problem asks whether
this is true for arbitrary values of $n$:


Problem 1 (Stochastic control problem). The inputs are an MDP M, a source
state $s \in Q$ and a target state $t \in Q$. The question is whether for all $n \in \mathbb{N}$,
there exists a strategy ensuring to reach $t^n$ almost surely from $s^n$.


Our main result is the following. 
Theorem 1. The stochastic control problem is decidable. 


The fact that the problem is co-recursively enumerable is easy to see: if the
answer is “no”, there exists $n \in \mathbb{N}$ such that there exists no strategy ensuring
to reach $t^n$ almost surely from $s^n$. Enumerating the values of $n$ and solving the
almost sure reachability problem for $M^n$ eventually finds this out. However, it
is not clear whether one can place an upper bound on such a witness $n$, which
would yield a simple (yet inefficient!) algorithm. As a corollary of our analysis
we can indeed derive such an upper bound, but it is non-elementary in the size
of the MDP.
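
The enumeration argument can be sketched as follows. This is only an illustration under stated assumptions: configurations are represented as sorted tuples (multisets of states), which is harmless because tokens are interchangeable; almost sure reachability is solved with a standard graph fixpoint; and the search stops at an arbitrary bound `max_n`, so a result of `None` proves nothing. The MDP and all function names are hypothetical.

```python
from itertools import combinations_with_replacement, product


def almost_sure_region(states, actions, succ, target):
    """States from which `target` is reached with probability one (support only)."""
    win = set(states)
    while True:
        reach = {target} & win
        grew = True
        while grew:
            grew = False
            for p in win - reach:
                for a in actions:
                    nxt = succ.get((p, a), set())
                    # usable action: defined, stays inside win, may make progress
                    if nxt and nxt.issubset(win) and nxt & reach:
                        reach.add(p)
                        grew = True
                        break
        if reach == win:
            return win
        win = reach


def solvable_for(n, states, actions, succ1, s, t):
    """Can n tokens be brought from s^n to t^n almost surely?"""
    confs = list(combinations_with_replacement(sorted(states), n))
    succ = {}
    for c in confs:
        for a in actions:
            options = [succ1.get((p, a)) for p in c]
            if all(options):  # the action must be available to every token
                succ[(c, a)] = {tuple(sorted(v)) for v in product(*options)}
    return (s,) * n in almost_sure_region(confs, actions, succ, (t,) * n)


def first_failure(states, actions, succ1, s, t, max_n):
    """Smallest n up to max_n with no almost sure strategy, or None if none is found."""
    for n in range(1, max_n + 1):
        if not solvable_for(n, states, actions, succ1, s, t):
            return n
    return None


# Hypothetical MDP: action a sends a token to qu or qd; only the matching
# action (u from qu, d from qd) is available afterwards.
STATES = {"s", "qu", "qd", "t"}
ACTIONS = ["a", "u", "d"]
SUCC1 = {
    ("s", "a"): {"qu", "qd"},
    ("qu", "u"): {"t"}, ("qd", "d"): {"t"},
    ("t", "a"): {"t"}, ("t", "u"): {"t"}, ("t", "d"): {"t"},
}
```

In this toy MDP a single token can always be brought to t, but two tokens may split between qu and qd, after which no action is available to both.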

In the remainder of this section we present a few interesting examples. 


Example 1 Let us consider the MDP represented in Figure 1. We show that
for this MDP, for any $n \in \mathbb{N}$, the controller has an almost sure strategy to reach
$t^n$ from $s^n$. Starting with $n$ tokens on $s$, we iterate the following strategy:


— Repeatedly play action $a$ until all tokens are in $q$;
— Play action $b$.


The first step is eventually successful with probability one, since at each iteration
there is a positive probability that the number of tokens in state $q$ increases. In
the second step, with non zero probability at least one token goes to $t$, while the
rest go back to $s$. It follows that each iteration of this strategy increases with
non zero probability the number of tokens in $t$. Hence, all tokens are eventually
transferred to $t$ almost surely.


Fig. 1. The controller can almost surely reach $t^n$ from $s^n$, for any $n \in \mathbb{N}$.


Example 2 We now consider the MDP represented in Figure 2. By convention,
if from a state some action does not have any outgoing transition (for instance
the action $u$ from $s$), then it goes to the sink state $\bot$.

We show that there exists a controller ensuring to transfer seven tokens from
$s$ to $t$, but that the same does not hold for eight tokens. For the first assertion,
we present the following strategy:


— Play $a$. One of the states $q_1^{i_1}$ for $i_1 \in \{u,d\}$ receives at least 4 tokens.
— Play $i_1 \in \{u,d\}$. At least 4 tokens go to $t$ while at most 3 go to $q_1$.
— Play $a$. One of the states $q_2^{i_2}$ for $i_2 \in \{u,d\}$ receives at least 2 tokens.
— Play $i_2 \in \{u,d\}$. At least 2 tokens go to $t$ while at most 1 token goes to $q_2$.
— Play $a$. The token (if any) goes to $q_3^{i_3}$ for $i_3 \in \{u,d\}$.
— Play $i_3 \in \{u,d\}$. The remaining token (if any) goes to $t$.


Now assume that there are 8 tokens or more on $s$. The only choices for a strategy
are to play $u$ or $d$ on the second, fourth, and sixth move. First, with non zero
probability at least 4 tokens are in each of $q_1^i$ for $i \in \{u,d\}$. Then, whatever the
choice of action $i \in \{u,d\}$, there are at least 4 tokens in $q_1$ after the next step.
Proceeding likewise, there are at least 2 tokens in $q_2$ with non zero probability
two steps later. Then again two steps later, at least 1 token falls in the sink with
non zero probability.
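
The counting behind the thresholds 7 and 8 can be checked with a few lines of Python. This is a simplified model assumed for illustration, not the MDP of Figure 2 itself: in each round the unfavourable outcome splits the remaining tokens as evenly as possible, the controller saves the larger group, and the smaller group moves on to the next round. The function names are hypothetical.

```python
def leftover_after_rounds(n, rounds=3):
    """Worst-case number of tokens not yet in t after `rounds` rounds."""
    for _ in range(rounds):
        n = n // 2  # the smaller group of an as-even-as-possible split
    return n


def max_safe_population(rounds=3):
    """Largest n whose worst-case leftover is 0, found by direct search."""
    n = 0
    while leftover_after_rounds(n + 1, rounds) == 0:
        n += 1
    return n
```

With three rounds, seven tokens shrink as 7, 3, 1, 0, whereas eight tokens shrink as 8, 4, 2, 1, leaving one token for the sink.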


Fig. 2. The controller can synchronise up to 7 tokens on the target state t almost 
surely, but not more. 


Generalising this example shows that if the answer to the stochastic control
problem is “no”, the smallest number of tokens $n$ for which there exists no almost
sure strategy for reaching $t^n$ from $s^n$ may be exponential in $|Q|$. This can
be further extended to show a lower bound doubly exponential in $|Q|$, as done in [3,4];
the example produced there holds for both the adversarial and the stochastic
setting. Interestingly, for the adversarial setting this doubly exponential lower
bound is tight. Our proof for the stochastic setting yields a non-elementary
bound, leaving a very large gap.


Example 3 We consider the MDP represented in Figure 3. For any $n \in \mathbb{N}$,
there exists a strategy almost surely reaching $t^n$ from $s^n$. However, this strategy
has to pass tokens one by one through $q_1$. We iterate the following strategy:


— Repeatedly play action $a$ until exactly 1 token is in $q_1$.
— Play action $b$. The token goes to $q_i$ for some $i \in \{l,r\}$.
— Play action $i \in \{l,r\}$, which moves the token to $t$.


Note that the first step may take a very long time (the expectation of the number
of $a$'s to be played until this happens is exponential in the number of tokens),
but it is eventually successful with probability one. This very slow strategy is
necessary: if $q_1$ contains at least two tokens, then action $b$ should not be played:
with non zero probability, at least one token ends up in each of $q_l$, $q_r$, so at the
next step some token ends up in $\bot$. It follows that any strategy almost surely
reaching $t^n$ has to be able to detect the presence of at most 1 token in $q_1$. This is
a key example for understanding the difficulty of the stochastic control problem.




Fig. 3. The controller can synchronise any number of tokens almost surely on the target 
state t, but they have to go one by one. 


3 The sequential flow problem 


We let $Q$ be a finite set of states. We call configuration an element of $\mathbb{N}^Q$ and
flow an element $f \in \mathbb{N}^{Q \times Q}$. A flow $f$ induces two configurations $\mathrm{pre}(f)$ and
$\mathrm{post}(f)$ defined by


$\mathrm{pre}(f)(p) = \sum_{q \in Q} f(p,q)$ and $\mathrm{post}(f)(q) = \sum_{p \in Q} f(p,q)$.


Given $c, c'$ two configurations and $f$ a flow, we say that $c$ goes to $c'$ using $f$ and
write $c \to^f c'$, if $c = \mathrm{pre}(f)$ and $c' = \mathrm{post}(f)$.

A flow word is $f = f_1 \ldots f_\ell$ where each $f_i$ is a flow. We write $c \leadsto^f c'$ if there
exists a sequence of configurations $c = c_0, c_1, \ldots, c_\ell = c'$ such that $c_{i-1} \to^{f_i} c_i$
for all $i \in \{1,\ldots,\ell\}$. In this case, we say that $c$ goes to $c'$ using the flow word $f$.
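
A small Python sketch may help fix these notions. It represents a configuration as a `Counter` over states and a flow as a dictionary from pairs of states to token counts; this representation and the names `pre`, `post`, `goes_to` and `run` are assumptions made for illustration.

```python
from collections import Counter


def pre(flow):
    """pre(f)(p): sum over q of f(p, q)."""
    conf = Counter()
    for (p, _q), k in flow.items():
        conf[p] += k
    return +conf  # unary plus drops zero entries


def post(flow):
    """post(f)(q): sum over p of f(p, q)."""
    conf = Counter()
    for (_p, q), k in flow.items():
        conf[q] += k
    return +conf


def goes_to(c, flow, c2):
    """c goes to c2 using the flow: c = pre(f) and c2 = post(f)."""
    return +Counter(c) == pre(flow) and +Counter(c2) == post(flow)


def run(c, flow_word):
    """Configuration reached from c by the flow word, or None if some flow does not apply."""
    current = +Counter(c)
    for flow in flow_word:
        if pre(flow) != current:
            return None
        current = post(flow)
    return current


# Hypothetical flows over states s, q, t.
F1 = {("s", "q"): 2, ("s", "s"): 1}
F2 = {("q", "t"): 2, ("s", "s"): 1}
```

Here the flow word F1 F2 takes three tokens in s to two tokens in t and one in s, and a flow applies only when its pre-configuration matches the current configuration exactly.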

We now recall some classical definitions related to well quasi orders ([15,16],
see [19] for an exposition of recent results). Let $(E, \leq)$ be a quasi ordered set
(i.e. $\leq$ is reflexive and transitive). It is a well quasi ordered set (WQO) if any
infinite sequence contains an increasing pair. We say that $S \subseteq E$ is downward
closed if for any $x \in S$, if $y \leq x$ then $y \in S$. An ideal is a non-empty downward
closed set $I \subseteq E$ such that for all $x,y \in I$, there exists some $z \in I$ satisfying
both $x \leq z$ and $y \leq z$.


Lemma 1. 


— Any infinite sequence of decreasing downward closed sets in a WQO is
eventually constant.
— A subset is downward closed if and only if it is a finite union of incomparable
ideals. We call this union its decomposition into ideals (or simply, its
decomposition), which is unique (up to permutation).
— An ideal is included in a downward closed set if and only if it is included in
one of the ideals of its decomposition.


We equip the set of configurations $\mathbb{N}^Q$ and the set of flows $\mathbb{N}^{Q \times Q}$ with the
quasi order $\leq$ defined component wise, yielding thanks to Dickson’s Lemma [6]
two WQOs.


Lemma 2. Let $X$ be a finite set. A subset of $\mathbb{N}^X$ is an ideal if and only if it is
of the form
$a{\downarrow} = \{c \in \mathbb{N}^X \mid c \leq a\}$,


for some $a \in (\mathbb{N} \cup \{\omega\})^X$ (in which $\omega$ is larger than all integers).
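
The ideal representation is easy to manipulate in code. In the sketch below, a vector over the naturals extended with omega is a dictionary whose missing entries are 0, and omega is modelled by `math.inf`; these choices and the function names are assumptions made for illustration.

```python
import math

OMEGA = math.inf  # stands for omega, larger than every integer


def in_ideal(c, a):
    """c belongs to the ideal generated by a iff c is below a component wise."""
    return all(a.get(x, 0) >= k for x, k in c.items())


def in_downward_closed(c, ideals):
    """Membership in a downward closed set given as a finite union of ideals."""
    return any(in_ideal(c, a) for a in ideals)


def ideal_included(a, b):
    """The ideal of a is included in the ideal of b iff a is below b component wise."""
    return all(b.get(x, 0) >= k for x, k in a.items())


# Hypothetical downward closed set: any number of tokens in s with at most
# one token in q, or any number of tokens in t.
IDEALS = [{"s": OMEGA, "q": 1}, {"t": OMEGA}]
```

The two generators are incomparable, so this list is already a decomposition in the sense of Lemma 1.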


We represent downward closed sets of configurations and flows using their
decomposition into finitely many ideals of the form $a{\downarrow}$ for $a \in (\mathbb{N} \cup \{\omega\})^Q$ or
$a \in (\mathbb{N} \cup \{\omega\})^{Q \times Q}$.


Problem 2 (Sequential flow problem). Let $Q$ be a finite set of states. Given a
downward closed set of flows $\mathrm{Flows} \subseteq \mathbb{N}^{Q \times Q}$ and a downward closed set of final
configurations $F \subseteq \mathbb{N}^Q$, compute the downward closed set


$\mathrm{Pre}^*(\mathrm{Flows}, F) = \{c \in \mathbb{N}^Q \mid c \leadsto^f c' \in F,\ f \in \mathrm{Flows}^*\}$,


i.e. the configurations from which one may reach $F$ using only flows from Flows.


4 Reduction of the stochastic control problem to the 
sequential flow problem 


Let us consider an MDP M and a target $t \in Q$. We first recall a folklore result
reducing the almost sure reachability question for MDPs to solving a two player
Büchi game (we refer to [14] for the definitions and notations of Büchi games).
The Büchi game is played between Eve and Adam as follows. From a state $p$:


1. Eve chooses an action $a$ and a transition $(p,q) \in \Delta_a$;
2. Adam can either choose to
   — agree, and the game continues from $q$, or
   — interrupt and choose another transition $(p,q') \in \Delta_a$, and the game continues
     from $q'$.


The Büchi objective is satisfied (meaning Eve wins) if either the target state $t$
is reached or Adam interrupts infinitely many times.


Lemma 3. There exists a strategy ensuring almost surely to reach $t$ from $s$ if
and only if Eve has a winning strategy from $s$ in the above Büchi game.


We now explain how this reduction can be extended to the stochastic control
problem. Let us consider an MDP M and a target $t \in Q$. We now define an
infinite Büchi game $\mathcal{G}_M$. The set of vertices is the set of configurations $\mathbb{N}^Q$. For
a flow $f$, we write $\mathrm{supp}(f) = \{(p,q) \in Q^2 : f(p,q) > 0\}$. The game is played as
follows from a configuration $c$:


1. Eve chooses an action $a$ and a flow $f$ such that $\mathrm{pre}(f) = c$ and $\mathrm{supp}(f) \subseteq \Delta_a$.
2. Adam can either choose to
   — agree, and the game continues from $c' = \mathrm{post}(f)$, or
   — interrupt and choose a flow $f'$ such that $\mathrm{pre}(f') = c$ and $\mathrm{supp}(f') \subseteq \Delta_a$,
     and the game continues from $c'' = \mathrm{post}(f')$.


Note that Eve choosing a flow $f$ is equivalent to choosing for each token a
transition $(p,q) \in \Delta_a$, inducing the configuration $c'$, and similarly for Adam
should he decide to interrupt.

Eve wins if either all tokens are in the target state, or if Adam interrupts 
infinitely many times. 

Note that although the game is infinite, it is actually a disjoint union of
finite games. Indeed, along a play the number of tokens is fixed, so each play is
included in $Q^n$ for some $n \in \mathbb{N}$.


Lemma 4. Let $c$ be a configuration with $n$ tokens in total. The following are
equivalent:


— There exists a strategy almost surely reaching $t^n$ from $c$,
— Eve has a winning strategy in the Büchi game $\mathcal{G}_M$ starting from $c$.


Lemma 4 follows from applying Lemma 3 on the product MDP $M^n$.

We also consider the game $\mathcal{G}_M^{(i)}$ for $i \in \mathbb{N}$, which is defined just as $\mathcal{G}_M$ except
for the winning objective: Eve wins in $\mathcal{G}_M^{(i)}$ if either all tokens are in the target
state, or if Adam interrupts more than $i$ times. It is clear that if Eve has a
winning strategy in $\mathcal{G}_M$ then she has a winning strategy in $\mathcal{G}_M^{(i)}$. Conversely, the
following result states that $\mathcal{G}_M^{(i)}$ is equivalent to $\mathcal{G}_M$ for some $i$.


Lemma 5. There exists $i \in \mathbb{N}$ such that from any configuration $c \in \mathbb{N}^Q$, Eve
has a winning strategy in $\mathcal{G}_M$ if and only if Eve has a winning strategy in $\mathcal{G}_M^{(i)}$.
Proof: Let $X^{(i)} \subseteq \mathbb{N}^Q$ be the winning region for Eve in $\mathcal{G}_M^{(i)}$. We first argue that
$X^{(\infty)} = \bigcap_i X^{(i)}$ is the winning region in $\mathcal{G}_M$. It is clear that $X^{(\infty)}$ contains the
winning region: if Eve has a strategy to ensure that either all tokens are in the
target state, or that Adam interrupts infinitely many times, then in particular
this is true for Adam interrupting more than $i$ times for any $i$. The converse
inclusion holds because $\mathcal{G}_M$ is a disjoint union of finite Büchi games. Indeed, in
a finite Büchi game, since Adam can restrict himself to playing a memoryless
winning strategy, if Eve can ensure that he interrupts a certain number of times
(larger than the size of the game), then by a simple pumping argument this
implies that Adam will interrupt infinitely many times.

To conclude, we note that each $X^{(i)}$ is downward closed: indeed, a winning
strategy from a configuration $c$ can be used from a configuration $c'$ where there
are fewer tokens in each state. It follows that $(X^{(i)})_{i \geq 0}$ is a decreasing sequence
of downward closed sets in $\mathbb{N}^Q$, hence it stabilises thanks to Lemma 1, i.e. there
exists $i_0 \in \mathbb{N}$ such that $X^{(i_0)} = \bigcap_i X^{(i)}$, which concludes the proof.


Note that Lemma 4 and Lemma 5 substantiate the claims made in Section 2:
pure positional strategies are enough and the answer to the stochastic control
problem does not depend upon the exact probabilities in the MDP. Indeed, the
construction of the Büchi games does not depend on them, and the answer to the
former is equivalent to determining whether Eve has a winning strategy in each
of them.

We are now fully equipped to show that a solution to the sequential flow
problem yields the decidability of the stochastic control problem.

Let $F$ be the set of configurations for which all tokens are in state $t$. We let
$X^{(i)} \subseteq \mathbb{N}^Q$ denote the winning region for Eve in the game $\mathcal{G}_M^{(i)}$. Note first that
$X^{(0)} = \mathrm{Pre}^*(\mathrm{Flows}^0, F)$ where


$\mathrm{Flows}^0 = \{f \in \mathbb{N}^{Q \times Q} : \exists a \in A,\ \mathrm{supp}(f) \subseteq \Delta_a\}$.


Indeed, in the game $\mathcal{G}_M^{(0)}$ Adam cannot interrupt as this would make him lose
immediately. Hence, the winning region for Eve in $\mathcal{G}_M^{(0)}$ is $\mathrm{Pre}^*(\mathrm{Flows}^0, F)$.

We generalise this by setting $\mathrm{Flows}^i$ for all $i > 0$ to be the set of flows
$f \in \mathbb{N}^{Q \times Q}$ such that for some action $a \in A$,


— $\mathrm{supp}(f) \subseteq \Delta_a$, and
— for every $f'$ with $\mathrm{pre}(f') = \mathrm{pre}(f)$ and $\mathrm{supp}(f') \subseteq \Delta_a$, we have $\mathrm{post}(f') \in X^{(i-1)}$.


Equivalently, this is the set of flows for which, when played in the game $\mathcal{G}_M$
by Eve, Adam cannot use an interrupt move and force the configuration outside
of $X^{(i-1)}$.

We now claim that
$X^{(i)} = \mathrm{Pre}^*(\mathrm{Flows}^i, F)$
for all $i \geq 0$.
We note that this means that for each $i$ computing $X^{(i)}$ reduces to solving one
instance of the sequential flow problem. This induces an algorithm for solving
the stochastic control problem: compute the sequence $(X^{(i)})_{i \geq 0}$ until it stabilises,
which is ensured by Lemma 5 and yields the winning region of $\mathcal{G}_M$. The answer
to the stochastic control problem is then whether the initial configuration where
all tokens are in $s$ belongs to the winning region of $\mathcal{G}_M$.

Let us prove the claim by induction on $i$.

Let $c$ be a configuration in $\mathrm{Pre}^*(\mathrm{Flows}^i, F)$. This means that there exists
a flow word $f = f_1 \cdots f_\ell$ such that $f_k \in \mathrm{Flows}^i$ for all $k$, and $c \leadsto^f c' \in F$.
Expanding the definition, there exist $c_0 = c, \ldots, c_\ell = c'$ such that $c_{k-1} \to^{f_k} c_k$
for all $k$.

Let us now describe a strategy for Eve in $\mathcal{G}_M^{(i)}$ starting from $c$. As long as
Adam agrees, Eve successively chooses the sequence of flows $f_1, f_2, \ldots$ and the
corresponding configurations $c_1, c_2, \ldots$. If Adam never interrupts, then the game
reaches the configuration $c' \in F$, and Eve wins. Otherwise, as soon as Adam
interrupts, by definition of $\mathrm{Flows}^i$, we reach a configuration $d \in X^{(i-1)}$. By
induction hypothesis, Eve has a strategy which ensures from $d$ to either reach $F$
or that Adam interrupts at least $i - 1$ times. In the latter case, adding the
interrupt move leading to $d$ yields $i$ interrupts, so this is a winning strategy for
Eve in $\mathcal{G}_M^{(i)}$ witnessing that $c \in X^{(i)}$.

Conversely, assume that there is a winning strategy $\sigma$ of Eve in $\mathcal{G}_M^{(i)}$ from
a configuration $c$. Consider a play consistent with $\sigma$: it either reaches $F$ or
Adam interrupts. Let us denote by $f = f_1, f_2, \ldots, f_\ell$ the sequence of flows until
then. We argue that $f_k \in \mathrm{Flows}^i$ for $k \in \{1,\ldots,\ell\}$. Let $f = f_k$ for some $k$; by
definition of the game, $\mathrm{supp}(f) \subseteq \Delta_a$ for some action $a$. Let $f'$ be such that
$\mathrm{pre}(f') = \mathrm{pre}(f)$ and $\mathrm{supp}(f') \subseteq \Delta_a$. In the game $\mathcal{G}_M^{(i)}$ after Eve played $f_k$, Adam has
the possibility to interrupt and choose $f'$. From this configuration onward the
strategy $\sigma$ is winning in $\mathcal{G}_M^{(i-1)}$, implying that $f \in \mathrm{Flows}^i$. Thus $f = f_1 f_2 \ldots f_\ell$
is a witness that $c \in X^{(i)}$.


5 Computability of the sequential flow problem 


Given a finite set of states $Q$, a downward closed set of flows
$\mathrm{Flows} \subseteq \mathbb{N}^{Q \times Q}$ and a downward closed set of
configurations $F \subseteq \mathbb{N}^{Q}$, the sequential flow problem is to
compute the downward closed set $\mathrm{Pre}^*$ defined by


$$\mathrm{Pre}^*(\mathrm{Flows}, F) = \{\, c \in \mathbb{N}^{Q} \mid c \leadsto^{f} c' \in F,\ f \in \mathrm{Flows}^* \,\},$$


i.e. the configurations from which one may reach $F$ using only flows from Flows.
The following classical result of [22] allows us to further reduce our problem. 


Lemma 6. The task of computing a downward closed set can be reduced to the 
task of deciding whether a given ideal is included in a downward closed set. 


Thanks to Lemma 6, it is sufficient for solving the sequential flow problem 
to establish the following result. 


Lemma 7. Let $I$ be an ideal of the form $a{\downarrow}$ for
$a \in (\mathbb{N} \cup \{\omega\})^{Q}$, and let
$\mathrm{Flows} \subseteq \mathbb{N}^{Q \times Q}$ be a downward closed set of flows.
It is decidable whether $F$ can be reached from all configurations of $I$ using only
flows from Flows.


We call a vector $a \in (\mathbb{N} \cup \{\omega\})^{Q \times Q}$ a capacity. A
capacity word is a finite sequence of capacities. For two capacity words $w, w'$ of
the same length, we write $w \le w'$ to mean that $w_i \le w'_i$ for each $i$. Since
flows are particular cases of capacities, we can compare flows with capacities in
the same way.

Before proving Lemma 7 let us give an example and some notations. 

Given a state $q$, we write $q \in \mathbb{N}^{Q}$ for the vector which has value 1
on the $q$ component and 0 elsewhere. More generally we let $\alpha q$, for
$\alpha \in \mathbb{N} \cup \{\omega\}$, denote the vector with value $\alpha$ on the
$q$ component and 0 elsewhere. We use similar notations for flows. For instance,
$\omega q_1 + q_2$ has value $\omega$ in the $q_1$ component, 1 in the $q_2$
component, and 0 elsewhere.

In the instance of the sequential flow problem represented in Figure 4, we ask
the following question: can $F$ be reached from any configuration of
$I = (\omega q_2){\downarrow}$? The answer is yes: the capacity word
$w = (a c^{n-1} b)^{n}$ is such that $n q_2 \leadsto^{f} n q_4 \in F$ for a flow word
$f \le w$, the beginning of which is described in Figure 5.


Fig. 4. An instance of the sequential flow problem. We let
$\mathrm{Flows} = a{\downarrow} \cup b{\downarrow} \cup c{\downarrow}$ where
$a = \omega(q_2, q_2) + (q_2, q_3) + \omega(q_4, q_4)$,
$b = \omega(q_1, q_2) + (q_3, q_4) + \omega(q_4, q_4)$, and
$c = \omega(q_1, q_1) + (q_2, q_1) + \omega(q_2, q_2) + \omega(q_3, q_3) + \omega(q_4, q_4)$.
Set also $F = (\omega q_4){\downarrow}$.


Fig. 5. A flow word $f = f_1 f_2 \cdots f_{n+1} \le a c^{n-1} b$ such that $n q_2$
goes to $(n-1) q_2 + q_4$ using $f$. This construction can be extended to $f \le w$
such that $n q_2$ goes to $n q_4$ using $f$.
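
The example of Figure 4 can be checked by brute force for small $n$. The sketch below is illustrative only and rests on an assumption about the definitions (which are given earlier in the paper, outside this section): a flow $f$ takes a configuration $c$ to $c'$ when every token of $c$ is routed along some edge, i.e. $c(q) = \sum_{q'} f(q,q')$ and $c'(q') = \sum_{q} f(q,q')$. All function names are hypothetical.

```python
from itertools import product

OMEGA = float("inf")  # stands for the unbounded capacity omega
Q = (1, 2, 3, 4)      # the states q1..q4 of Figure 4

# Capacities of Figure 4; missing pairs have capacity 0.
A = {(2, 2): OMEGA, (2, 3): 1, (4, 4): OMEGA}
B = {(1, 2): OMEGA, (3, 4): 1, (4, 4): OMEGA}
C = {(1, 1): OMEGA, (2, 1): 1, (2, 2): OMEGA, (3, 3): OMEGA, (4, 4): OMEGA}


def splits(tokens, bounds):
    """All ways to send exactly `tokens` tokens along edges with the given upper bounds."""
    if not bounds:
        if tokens == 0:
            yield ()
        return
    first, rest = bounds[0], bounds[1:]
    for k in range(int(min(tokens, first)) + 1):
        for tail in splits(tokens - k, rest):
            yield (k,) + tail


def successors(config, capacity):
    """Configurations reached from `config` by one flow below `capacity`."""
    per_state = []
    for q in Q:
        bounds = [capacity.get((q, r), 0) for r in Q]
        per_state.append(list(splits(config[q - 1], bounds)))
    result = set()
    for rows in product(*per_state):
        result.add(tuple(sum(row[j] for row in rows) for j in range(len(Q))))
    return result


def reachable(start, word):
    """Configurations reached from `start` by some flow word below the capacity word."""
    current = {start}
    for letter in word:
        current = set().union(*(successors(c, letter) for c in current))
    return current


def reaches_target(n):
    """Does n*q2 reach n*q4 under the capacity word (a c^(n-1) b)^n?"""
    word = ([A] + [C] * (n - 1) + [B]) * n
    return (0, 0, 0, n) in reachable((0, n, 0, 0), word)
```

For instance, `reaches_target(n)` can be evaluated for $n = 1, 2, 3$, and `reachable((0, 2, 0, 0), [A, C, B])` shows the effect of a single round $a\,c\,b$ on two tokens. The enumeration is exponential and only meant for such tiny cases.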


We write $a[\omega \leftarrow n]$ for the configuration obtained from $a$ by
replacing all $\omega$s by $n$.

The key idea for solving the sequential flow problem is to rephrase it using
regular cost functions (a set of tools for solving boundedness questions). Indeed,
whether $F$ can be reached from all configurations of $I = a{\downarrow}$ using only
flows from Flows can be equivalently phrased as a boundedness question, as follows:


does there exist a bound on the values of $n \in \mathbb{N}$ such that
$a[\omega \leftarrow n] \leadsto^{f} c$ for some $c \in F$ and
$f \in \mathrm{Flows}^*$?


We show that this boundedness question can be formulated as a boundedness
question for a formula of cost monadic logic, a formalism that we introduce now.
We assume that the reader is familiar with monadic second order logic (MSO)
over finite words, and refer to [20] for the definitions. The syntax of cost monadic
logic (cost MSO for short) extends MSO with the construct $|X| \le N$, where $X$ is
a second order variable and $N$ is a bounding variable. The semantics is defined
as usual: $w, n \models \varphi$ for a word $w \in A^*$, with $n \in \mathbb{N}$
specifying the bound $N$. We assume that there is at most one bounding variable, and
that the construct $|X| \le N$ appears positively, i.e. under an even number of
negations. This ensures that the larger $N$, the more true the formula is: if
$w, n \models \varphi$, then $w, n' \models \varphi$ for all $n' \ge n$. The
semantics of a formula $\varphi$ of cost MSO induces a function
$A^* \to \mathbb{N} \cup \{\infty\}$ defined by
$\varphi(w) = \inf\{n \in \mathbb{N} \mid w, n \models \varphi\}$.
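
As a small illustration of this semantics (an example added here, not taken from the text), take $A = \{a, b\}$ and assume an ordinary MSO formula $\mathrm{block}_a(X)$ stating that $X$ is a set of consecutive positions all labelled $a$; the name $\mathrm{block}_a$ is hypothetical.

```latex
\varphi \;=\; \forall X\, \bigl( \mathrm{block}_a(X) \Rightarrow |X| \le N \bigr),
\qquad
\varphi(w) \;=\; \inf\{\, n \in \mathbb{N} \mid w, n \models \varphi \,\}
\;=\; \text{length of the longest block of consecutive $a$'s in } w .
```

The construct $|X| \le N$ occurs positively here, since it sits on the right-hand side of the implication. The induced function is unbounded over $A^*$ (consider the words $a^n$), but it is bounded by 1 over the language $(ab)^*$.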

The boundedness problem for cost monadic logic is the following problem:
given a cost MSO formula $\varphi$ over $A^*$, is it true that the function
$A^* \to \mathbb{N} \cup \{\infty\}$ is bounded, i.e.:


$$\exists n \in \mathbb{N},\ \forall w \in A^*,\ w, n \models \varphi\ ?$$


The decidability of the boundedness problem is a central result in the theory of
regular cost functions ([5]). Since in the theory of regular cost functions, when
considering functions we are only interested in whether they are bounded or
not, we will consider functions "up to boundedness properties". Concretely, this
means that a cost function is an equivalence class of functions
$A^* \to \mathbb{N} \cup \{\infty\}$, with the equivalence being $f \approx g$ if
there exists $\alpha : \mathbb{N} \to \mathbb{N}$ such that $f(w)$ is finite if and
only if $g(w)$ is finite, and in this case, $f(w) \le \alpha(g(w))$ and
$g(w) \le \alpha(f(w))$. This is equivalent to stating that for all
$X \subseteq A^*$, $f$ is bounded over $X$ if and only if $g$ is bounded over $X$.
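
A quick worked instance of this equivalence may help; the functions below are chosen purely for illustration, with $|w|_a$ denoting the number of occurrences of the letter $a$ in $w$.

```latex
f(w) = |w|_a, \qquad g(w) = 2\,|w|_a + 1, \qquad \alpha(n) = 2n + 1 :
\qquad
f(w) \le 4\,|w|_a + 3 = \alpha(g(w)),
\qquad
g(w) = \alpha(f(w)) .
```

Hence $f \approx g$. In contrast, $f$ is not equivalent to $h(w) = |w|_b$: over $X = a^*$ the function $h$ is bounded (by 0) while $f$ is not.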
Let us now establish Lemma 7. 


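Proof: Let $T = \{q \in Q \mid a(q) = \omega\}$. Note that for $n$ sufficiently
large, we have $a[\omega \leftarrow n]{\downarrow} = I \cap \{0, 1, \ldots, n\}^{Q}$.
We let $\mathcal{C} \subseteq (\mathbb{N} \cup \{\omega\})^{Q \times Q}$ be the
decomposition of Flows into ideals, that is, $\mathcal{C}$ is the minimal finite set
such that


$$\mathrm{Flows} = \bigcup_{b \in \mathcal{C}} b{\downarrow}.$$


We let $k$ denote the largest finite value that appears in the definition of
$\mathcal{C}$, that is,
$k = \max\{b(q,q') : b \in \mathcal{C},\ q, q' \in Q,\ b(q,q') \neq \omega\}$.
Let us define the function

$$\Phi : \mathcal{C}^* \to \mathbb{N} \cup \{\omega\}, \qquad
w \mapsto \sup\{\, n \in \mathbb{N} : \exists f \le w,\ a[\omega \leftarrow n] \leadsto^{f} F \,\}.$$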

By definition $\Phi$ is unbounded if and only if $F$ can be reached from all
configurations of $I$. Since boundedness of cost MSO is decidable, it suffices to
construct a formula in cost monadic logic for $\Phi$ to obtain the decidability of
our problem. Our approach will be to additively decompose the capacity word $w$ into
a finitary part $w^{(\mathrm{fin})}$ (which is handled using a regular language), and
several unbounded parts $w^{(s)}$ for each $s \in T$. The unbounded parts require a
more careful analysis which notably goes through the use of the max-flow min-cut
theorem.

Note that $a[\omega \leftarrow n]$ decomposes as the sum of its finite part
$a_{\mathrm{fin}} = a[\omega \leftarrow 0]$ and $\sum_{s \in T} n s$. Since flows are
additive, it holds that $f \le w = w_1 \cdots w_\ell$ is a flow from
$a[\omega \leftarrow n]$ to $F$ if and only if the capacity word $w$ may be
decomposed into
$(w^{(s)})_{s \in T} = (w^{(s)}_1 \cdots w^{(s)}_\ell)_{s \in T}$ and
$w^{(\mathrm{fin})} = w^{(\mathrm{fin})}_1 \cdots w^{(\mathrm{fin})}_\ell$ such that

- all the numbers appearing in the $w^{(s)}_i$ capacities are bounded by $k$,
- for all $i \in \{1, \ldots, \ell\}$, $w_i = \sum_{s \in T \cup \{\mathrm{fin}\}} w^{(s)}_i$,
- for all $s \in T$, $n s \leadsto^{f} F$ for some flow word $f \le w^{(s)}$,
- and $a_{\mathrm{fin}} \leadsto^{f} F$ for some flow word $f \le w^{(\mathrm{fin})}$.


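In order to encode such capacity words in cost MSO we use monadic variables
$W^{(s)}_{q,q',p}$ where $q, q' \in Q$, $p \in \{0, \ldots, k, \omega\}$ and
$s \in T \cup \{\mathrm{fin}\}$. They are meant to satisfy that
$i \in W^{(s)}_{q,q',p}$ if and only if $w^{(s)}_i(q,q') = p$. We use bold
$\mathbf{W}$ to denote the tuple $(W^{(s)}_{q,q',p})_{q,q',p,s}$, and
$\mathbf{W}^{(s)}$ for $(W^{(s)}_{q,q',p})_{q,q',p}$ when
$s \in T \cup \{\mathrm{fin}\}$ is fixed.
The MSO formula $\mathrm{IsDecomp}(\mathbf{W}, w)$ states that a decomposition
$(w^{(s)})_{s \in T \cup \{\mathrm{fin}\}}$ is semantically valid and sums to $w$:


$$\forall i,\ \Big[ \bigwedge_{q,q',s}\ \bigvee_{p \in \{0,\ldots,k,\omega\}}
\Big( i \in W^{(s)}_{q,q',p} \wedge \bigwedge_{p' \neq p} i \notin W^{(s)}_{q,q',p'} \Big) \Big]
\ \wedge\
\Big[ \bigwedge_{q,q',p} \Big( w_i(q,q') = p \ \Rightarrow
\bigvee_{\substack{(p_s)_{s \in T \cup \{\mathrm{fin}\}} \\ \sum_s p_s = p}}\
\bigwedge_{s \in T \cup \{\mathrm{fin}\}} i \in W^{(s)}_{q,q',p_s} \Big) \Big].$$
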
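For $s \in T$, we now consider the function

$$\Psi^{(s)} : (\{0, 1, \ldots, k, \omega\}^{Q \times Q})^* \to \mathbb{N} \cup \{\omega\}, \qquad
w^{(s)} \mapsto \sup\{\, n \in \mathbb{N} \mid \exists f \le w^{(s)},\ n s \leadsto^{f} F \,\}.$$


We also define $\Psi^{(\mathrm{fin})} \subseteq (\{0, \ldots, k, \omega\}^{Q \times Q})^*$
to be the language of capacity words $w^{(\mathrm{fin})}$ such that there exists a
flow word $f \le w^{(\mathrm{fin})}$ with $a_{\mathrm{fin}} \leadsto^{f} F$. Note that
$\Psi^{(\mathrm{fin})}$ is a regular language since it is recognized by a finite
automaton over $\{0, 1, \ldots, k|Q|\}^{Q}$ that may update the current bounded
configuration only with flows smaller than the current letter of
$w^{(\mathrm{fin})}$.

We have


$$\Phi(w) = \sup\Big\{\, n \in \mathbb{N} \ \Big|\ \exists \mathbf{W},\
\mathrm{IsDecomp}(\mathbf{W}, w) \wedge
\Big( \bigwedge_{s \in T} \Psi^{(s)}(\mathbf{W}^{(s)}) \ge n \Big) \wedge
\mathbf{W}^{(\mathrm{fin})} \in \Psi^{(\mathrm{fin})} \Big\}.$$


Hence, it is sufficient to prove that for each $s \in T$, $\Psi^{(s)}$ is definable
in cost MSO.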

Let us fix $s$ and a capacity word $w \in (\{0, \ldots, k, \omega\}^{Q \times Q})^*$
of length $|w| = \ell$. Consider the finite graph $G$ with vertex set
$Q \times \{0, 1, \ldots, \ell\}$ and, for all $i \ge 1$, an edge from $(q, i-1)$ to
$(q', i)$ labelled by $w_i(q,q')$. Then $\Psi^{(s)}(w)$ is the maximal flow from
$(s, 0)$ to $(t, \ell)$ in $G$. We recall that a cut in a graph with distinguished
source $s$ and target $t$ is a set of edges such that removing them disconnects $s$
and $t$. The cost of a cut is the sum of the weights of its edges. The max-flow
min-cut theorem states that the maximal flow in a graph is exactly the minimal cost
of a cut ([11]).
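
To make the layered graph concrete, here is a minimal sketch that builds it from a capacity word and computes the maximal flow with a textbook breadth-first augmenting-path search (Edmonds-Karp). This is a standard algorithm swapped in for illustration, not the construction used in the proof; the capacities `A`, `B`, `C` are those of Figure 4, and the function name is hypothetical.

```python
from collections import deque

OMEGA = float("inf")  # stands for the unbounded capacity omega

A = {(2, 2): OMEGA, (2, 3): 1, (4, 4): OMEGA}
B = {(1, 2): OMEGA, (3, 4): 1, (4, 4): OMEGA}
C = {(1, 1): OMEGA, (2, 1): 1, (2, 2): OMEGA, (3, 3): OMEGA, (4, 4): OMEGA}


def layered_max_flow(word, source, target):
    """Max flow from (source, 0) to (target, len(word)) in the layered graph of `word`."""
    src, dst = (source, 0), (target, len(word))
    if src == dst:
        return OMEGA
    # Residual capacities: an edge (q, i-1) -> (r, i) for each positive w_i(q, r).
    residual = {}
    for i, letter in enumerate(word, start=1):
        for (q, r), bound in letter.items():
            if bound > 0:
                u, v = (q, i - 1), (r, i)
                residual.setdefault(u, {})[v] = bound
                residual.setdefault(v, {}).setdefault(u, 0)
    total = 0
    while True:
        # Breadth-first search for an augmenting path.
        parent = {src: None}
        queue = deque([src])
        while queue and dst not in parent:
            u = queue.popleft()
            for v, bound in residual.get(u, {}).items():
                if bound > 0 and v not in parent:
                    parent[v] = u
                    queue.append(v)
        if dst not in parent:
            return total
        path = []
        v = dst
        while parent[v] is not None:
            path.append((parent[v], v))
            v = parent[v]
        bottleneck = min(residual[u][v] for u, v in path)
        if bottleneck == OMEGA:
            return OMEGA  # a path made only of omega edges carries unbounded flow
        for u, v in path:
            residual[u][v] -= bottleneck
            residual[v][u] += bottleneck
        total += bottleneck


def figure4_word(m):
    """The capacity word (a c^(m-1) b)^m."""
    return ([A] + [C] * (m - 1) + [B]) * m
```

On the word $(a c^{m-1} b)^m$ with source $q_2$ and target $q_4$, every path must cross one of the $m$ edges $(q_3, q_4)$ of capacity 1 contributed by the letters $b$, which gives a cut of cost $m$.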

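We now define a cost MSO formula $\tilde{\Psi}^{(s)}$ which is equivalent (in terms
of cost functions) to the minimal cost of a cut in the previous graph $G$ and thus to
$\Psi^{(s)}$. In the following formula, $X = (X_{q,q'})_{q,q' \in Q}$ represents a
cut in the graph: $i \in X_{q,q'}$ means that edge $((q, i-1), (q', i))$ belongs to
the cut. Likewise, $P = (P_{q,q'})_{q,q' \in Q}$ represents paths in the graph. Let
$\tilde{\Psi}^{(s)}(w)$ be defined by


$$\inf\Big\{\, n \ \Big|\ \exists X,\ \bigwedge_{q,q'} \Big( |X_{q,q'}| \le n \wedge
\big( \forall i,\ i \in X_{q,q'} \Rightarrow w_i(q,q') < \omega \big) \Big)
\wedge \mathrm{Disc}_{s,t}(X, w) \Big\}$$


where $\mathrm{Disc}_{s,t}(X, w)$ expresses that $X$ disconnects $(s, 0)$ and
$(t, \ell)$ in $G$. For instance, $\mathrm{Disc}_{s,t}(X, w)$ is defined by


$$\forall P,\ \Big[ \bigwedge_{q,q'} \big( \forall i,\ i \in P_{q,q'} \Rightarrow w_i(q,q') > 0 \big)
\wedge \Big( \bigvee_{q} 0 \in P_{s,q} \Big)
\wedge \Big( \bigvee_{q} \ell \in P_{q,t} \Big)
\wedge \forall i \ge 1,\ \bigwedge_{q,q'} \Big( i \in P_{q,q'} \Rightarrow \bigvee_{q''} i - 1 \in P_{q'',q} \Big) \Big]
\ \Rightarrow\ \exists i,\ \bigvee_{q,q'} \big( i \in X_{q,q'} \wedge i \in P_{q,q'} \big).$$


Now $\tilde{\Psi}^{(s)}(w)$ does not exactly define the minimal total weight
$\Psi^{(s)}(w)$ of a cut, but rather the minimal value over all cuts of the maximum
over $(q,q') \in Q^2$ of how many edges are of the form $((q, i-1), (q', i))$. This
is good enough for our purposes since these two values are related by


$$\tilde{\Psi}^{(s)}(w) \le \Psi^{(s)}(w) \le k\,|Q|^2\, \tilde{\Psi}^{(s)}(w),$$


implying that the functions $\Psi^{(s)}$ and $\tilde{\Psi}^{(s)}$ define the same
cost function. In particular, $\Psi^{(s)}$ is definable in cost MSO.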


6 Conclusions 


We showed the decidability of the stochastic control problem. Our approach uses 
well quasi orders and the sequential flow problem, which is then solved using the 
theory of regular cost functions. 

Together with the original result of [3,4] in the adversarial setting, our result 
contributes to the theoretical foundations of parameterised control. We return to 
the first application of this model, control of biological systems. As we discussed,
the stochastic setting is perhaps more satisfactory than the adversarial one,
although, as we saw, very complicated behaviours emerge in the stochastic setting
involving single agents, which are arguably not pertinent for modelling biological
systems.

We thus pose two open questions. The first is to settle the complexity status 
of the stochastic control problem. Very recently [18] proved the EXPTIME- 
hardness of the problem, which is interesting because the underlying phenomena 
involved in this hardness result are specific to the stochastic setting (and do not 
apply to the adversarial setting). Our algorithm does not even yield elementary 
upper bounds, leaving a very large complexity gap. The second question is to- 
wards more accurately modelling biological systems: can we refine the stochastic 
control problem by taking into account the synchronising time of the controller, 
and restrict it to reasonable bounds? 


Acknowledgements 


We thank Nathalie Bertrand and Blaise Genest for introducing us to this fasci- 
nating problem, and the preliminary discussions at the Simons Institute for the 
Theory of Computing in Fall 2015. 


References 


1. Abdulla, P.A., Henda, N.B., Mayr, R.: Decisive Markov chains. Logical Methods
in Computer Science 3(4) (2007). https://doi.org/10.2168/LMCS-3(4:7)2007

2. Angluin, D., Aspnes, J., Diamadi, Z., Fischer, M.J., Peralta, R.: Computation in
networks of passively mobile finite-state sensors. Distributed Computing 18(4),
235-253 (2006). https://doi.org/10.1007/s00446-005-0138-3

3. Bertrand, N., Dewaskar, M., Genest, B., Gimbert, H.: Controlling a population.
In: CONCUR. pp. 12:1-12:16 (2017). https://doi.org/10.4230/LIPIcs.CONCUR.2017.12

4. Bertrand, N., Dewaskar, M., Genest, B., Gimbert, H., Godbole, A.A.: Controlling
a population. Logical Methods in Computer Science 15(3) (2019),
https://lmcs.episciences.org/5647

5. Colcombet, T.: Regular cost functions, part I: logic and algebra over words.
Logical Methods in Computer Science 9(3) (2013).
https://doi.org/10.2168/LMCS-9(3:3)2013

6. Dickson, L.E.: Finiteness of the odd perfect and primitive abundant numbers with
n distinct prime factors. American Journal of Mathematics 35(4), 413-422 (1913),
http://www.jstor.org/stable/2370405

7. Esparza, J.: Parameterized verification of crowds of anonymous processes.
In: Dependable Software Systems Engineering, pp. 59-71. IOS Press (2016).
https://doi.org/10.3233/978-1-61499-627-9-59

8. Esparza, J., Finkel, A., Mayr, R.: On the verification of broadcast protocols. In:
LICS. pp. 352-359 (1999). https://doi.org/10.1109/LICS.1999.782630

9. Fijalkow, N.: Undecidability results for probabilistic automata. SIGLOG News
4(4), 10-17 (2017), https://dl.acm.org/citation.cfm?id=3157833


10. Fijalkow, N., Gimbert, H., Horn, F., Oualhadj, Y.: Two recursively inseparable
problems for probabilistic automata. In: MFCS. pp. 267-278 (2014).
https://doi.org/10.1007/978-3-662-44522-8_23

11. Ford, L.R., Fulkerson, D.R.: Maximal flow through a network. Canadian Journal
of Mathematics 8, 399-404 (1956). https://doi.org/10.4153/CJM-1956-045-5

12. German, S.M., Sistla, A.P.: Reasoning about systems with many processes. Journal
of the ACM 39(3), 675-735 (1992)

13. Gimbert, H., Oualhadj, Y.: Probabilistic automata on finite words: Decidable and
undecidable problems. In: ICALP. pp. 527-538 (2010).
https://doi.org/10.1007/978-3-642-14162-1_44

14. Grädel, E., Thomas, W., Wilke, T. (eds.): Automata, Logics, and Infinite Games,
LNCS, vol. 2500. Springer (2002)

15. Higman, G.: Ordering by divisibility in abstract algebras. Proceedings of the
London Mathematical Society s3-2(1), 326-336 (1952).
https://doi.org/10.1112/plms/s3-2.1.326

16. Kruskal, J.B.: The theory of well-quasi-ordering: A frequently discovered concept.
J. Comb. Theory, Ser. A 13(3), 297-305 (1972).
https://doi.org/10.1016/0097-3165(72)90063-5

17. Kučera, A.: Turn-Based Stochastic Games. Lectures in Game Theory for Computer
Scientists, Cambridge University Press (2011)

18. Mascle, C., Shirmohammadi, M., Totzke, P.: Controlling a random population is
EXPTIME-hard. CoRR (2019), http://arxiv.org/abs/1909.06420

19. Schmitz, S.: Algorithmic Complexity of Well-Quasi-Orders. Habilitation à diriger
des recherches, École normale supérieure Paris-Saclay (Nov 2017),
https://tel.archives-ouvertes.fr/tel-01663266

20. Thomas, W.: Languages, automata, and logic. In: Handbook of Formal Language
Theory, vol. III, pp. 389-455. Springer (1997)

21. Uhlendorf, J., Miermont, A., Delaveau, T., Charvin, G., Fages, F., Bottani, S.,
Hersen, P., Batt, G.: In silico control of biomolecular processes. Computational
Methods in Synthetic Biology 13, 277-285 (2015)

22. Valk, R., Jantzen, M.: The residue of vector sets with applications to
decidability problems in Petri nets. Acta Informatica 21, 643-674 (1985).
https://doi.org/10.1007/BF00289715


Open Access This chapter is licensed under the terms of the Creative Commons
Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/),
which permits use, sharing, adaptation, distribution and reproduction in any medium
or format, as long as you give appropriate credit to the original author(s) and the
source, provide a link to the Creative Commons license and indicate if changes were
made.

The images or other third party material in this chapter are included in the chapter's
Creative Commons license, unless indicated otherwise in a credit line to the material. If
material is not included in the chapter's Creative Commons license and your intended
use is not permitted by statutory regulation or exceeds the permitted use, you will need
to obtain permission directly from the copyright holder.




Decomposing Probabilistic Lambda-Calculi 


Ugo Dal Lago¹, Giulio Guerrieri², and Willem Heijltjes²


¹ Dipartimento di Informatica - Scienza e Ingegneria
Università di Bologna, Bologna, Italy
ugo.dallago@unibo.it

² Department of Computer Science
University of Bath, Bath, UK
{w.b.heijltjes,g.guerrieri}@bath.ac.uk


Abstract. A notion of probabilistic lambda-calculus usually comes with 
a prescribed reduction strategy, typically call-by-name or call-by-value, 
as the calculus is non-confluent and these strategies yield different results. 
This is a break with one of the main advantages of lambda-calculus: 
confluence, which means that results are independent from the choice 
of strategy. We present a probabilistic lambda-calculus where the proba- 
bilistic operator is decomposed into two syntactic constructs: a generator, 
which represents a probabilistic event; and a consumer, which acts on 
the term depending on a given event. The resulting calculus, the Prob- 
abilistic Event Lambda-Calculus, is confluent, and interprets the call- 
by-name and call-by-value strategies through different interpretations of 
the probabilistic operator into our generator and consumer constructs. 
We present two notions of reduction, one via fine-grained local rewrite 
steps, and one by generation and consumption of probabilistic events. 
Simple types for the calculus are essentially standard, and they convey 
strong normalization. We demonstrate how we can encode call-by-name 
and call-by-value probabilistic evaluation. 


1 Introduction 


Probabilistic lambda-calculi [24,22,17,11,18,9,15] extend the standard
lambda-calculus with a probabilistic choice operator $N \oplus_p M$, which chooses
$N$ with probability $p$ and $M$ with probability $1 - p$ (throughout this paper, we
let $p$ be 1/2 and will omit it). Duplication of $N \oplus M$, as is wont to happen
in lambda-calculus, raises a fundamental question about its semantics: do the
duplicate occurrences represent the same probabilistic event, or different ones with
the same probability? For example, take the term $\top \oplus \bot$ that represents a
coin flip between boolean values true $\top$ and false $\bot$. If we duplicate this
term, do the copies represent two distinct coin flips with possibly distinct
outcomes, or do these represent a single coin flip that determines the outcome for
both copies? Put differently again, when we duplicate $\top \oplus \bot$, do we
duplicate the event, or only its outcome?

In probabilistic lambda-calculus, these two interpretations are captured by
the evaluation strategies of call-by-name ($\to_{\mathsf{cbn}}$), which duplicates
events, and call-by-value ($\to_{\mathsf{cbv}}$), which evaluates any probabilistic
choice before it is duplicated, and thus only duplicates outcomes. Consider the
following example, where $=$ tests equality of boolean values.


$$\top \ \twoheadleftarrow_{\mathsf{cbv}}\ (\lambda x.\, x = x)(\top \oplus \bot)
\ \twoheadrightarrow_{\mathsf{cbn}}\ \top \oplus \bot$$


This situation is not ideal, for several, related reasons. Firstly, it demonstrates 
how probabilistic lambda-calculus is non-confluent, negating one of the central 
properties of the lambda-calculus, and one of the main reasons why it is the 
prominent model of computation that it is. Secondly, it means that a probabilis- 
tic lambda-calculus must derive its semantics from a prescribed reduction strat- 
egy, and its terms only have meaning in the context of that strategy. Thirdly, 
combining different kinds of probabilities becomes highly involved [15], as it 
would require specialized reduction strategies. These issues present themselves 
even in a more general setting, namely that of commutative (algebraic) effects, 
which in general do not commute with copying. 
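
The difference between the two strategies on the example $(\lambda x.\, x = x)(\top \oplus \bot)$ can be reproduced by enumerating coin flips. The sketch below is only an illustration of the two readings (duplicating the outcome versus duplicating the event) with exact probabilities; it does not implement either reduction strategy, and the function names are hypothetical.

```python
from fractions import Fraction
from itertools import product


def coin():
    """A fair coin flip between True (top) and False (bottom)."""
    half = Fraction(1, 2)
    return [(True, half), (False, half)]


def call_by_value():
    """Flip once, then duplicate the outcome: x = x compares a value with itself."""
    dist = {}
    for value, prob in coin():
        result = value == value
        dist[result] = dist.get(result, 0) + prob
    return dist


def call_by_name():
    """Duplicate the event: each copy of the argument flips its own coin."""
    dist = {}
    for (left, p_left), (right, p_right) in product(coin(), coin()):
        result = left == right
        dist[result] = dist.get(result, 0) + p_left * p_right
    return dist
```

Under the first reading the result is true with probability 1; under the second it is true or false with probability 1/2 each, matching $\top$ and $\top \oplus \bot$ respectively.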

We address these issues by a decomposition of the probabilistic operator into
a generator $[a]$ and a choice $\oplus^a$, as follows.


$$N \oplus M \ \triangleq\ [a].\, N \oplus^a M$$


Semantically, $[a]$ represents a probabilistic event, that generates a boolean value
recorded as $a$. The choice $N \oplus^a M$ is simply a conditional on $a$, choosing
$N$ if $a$ is false and $M$ if $a$ is true. Syntactically, $a$ is a boolean variable
with an occurrence in $\oplus^a$, and $[a]$ acts as a probabilistic quantifier,
binding all occurrences in its scope. (To capture a non-equal chance, one would
attach a probability $p$ to a generator, as $[a]_p$, though we will not do so in this
paper.)

The resulting probabilistic event lambda-calculus $\Lambda_{PE}$, which we present
in this paper, is confluent. Our decomposition allows us to separate duplicating
an event, represented by the generator $[a]$, from duplicating only its outcome
$a$, through having multiple choice operators $\oplus^a$. In this way our calculus
may interpret both original strategies, call-by-name and call-by-value, by different
translations of standard probabilistic terms into $\Lambda_{PE}$: call-by-name by the
above decomposition (see also Section 2), and call-by-value by a different one (see
Section 7). For our initial example, we get the following translations and
reductions.


cbn: $(\lambda x.\, x = x)([a].\, \top \oplus^a \bot) \ \to_\beta\ ([a].\, \top \oplus^a \bot) = ([a].\, \top \oplus^a \bot) \ \twoheadrightarrow\ \top \oplus \bot$ (1)
cbv: $[a].\, (\lambda x.\, x = x)(\top \oplus^a \bot) \ \to_\beta\ [a].\, (\top \oplus^a \bot) = (\top \oplus^a \bot) \ \twoheadrightarrow\ \top$ (2)


We present two reduction relations for our probabilistic constructs, both in- 
dependent of beta-reduction. Our main focus will be on permutative reduction 
(Sections 2, 3), a small-step local rewrite relation which is computationally ineffi- 
cient but gives a natural and very fine-grained operational semantics. Projective 
reduction (Section 6) is a more standard reduction, following the intuition that 
$[a]$ generates a coin flip to evaluate $\oplus^a$, and is coarser but more efficient.

We further prove confluence (Section 4), and we give a system of simple 
types and prove strong normalization for typed terms by reducibility (Section 5). 
Omitted proofs can be found in [7], the long version of this paper. 


1.1 Related Work


Probabilistic λ-calculi are a topic of study since the pioneering work by
Saheb-Djaromi [24], the first to give the syntax and operational semantics of a
λ-calculus with binary probabilistic choice. Giving well-behaved denotational models
for probabilistic λ-calculi has proved to be challenging, as witnessed by the many
contributions spanning the last thirty years: from Jones and Plotkin's early study
of the probabilistic powerdomain [17], to Jung and Tix's remarkable (and mostly
negative) observations [18], to the very recent encouraging results by
Goubault-Larrecq [16]. A particularly well-behaved model for probabilistic λ-calculus
can be obtained by taking a probabilistic variation of Girard's coherent spaces [10],
this way getting full abstraction [13].

On the operational side, one could mention a study about the various ways
the operational semantics of a calculus with binary probabilistic choice can be
specified, namely by small-step or big-step semantics, or by inductively or
coinductively defined sets of rules [9]. Termination and complexity analysis of
higher-order probabilistic programs seen as λ-terms have been studied by way of type
systems in a series of recent results about size [6], intersection [4], and refinement
type disciplines [1]. Contextual equivalence on probabilistic λ-calculi has been
studied, and compared with equational theories induced by Böhm Trees [19],
applicative bisimilarity [8], or environmental bisimilarity [25].

In all the aforementioned works, probabilistic λ-calculi have been taken as
implicitly endowed with either call-by-name or call-by-value strategies, for the 
reasons outlined above. There are only a few exceptions, namely some works on 
Geometry of Interaction [5], Probabilistic Coherent Spaces [14], and Standard- 
ization [15], which achieve, in different contexts, a certain degree of indepen- 
dence from the underlying strategy, thus accommodating both call-by-name and 
call-by-value evaluation. The way this is achieved, however, invariably relies on 
Linear Logic or related concepts. This is deeply different from what we do here. 

Some words of comparison with Faggian and Ronchi Della Rocca's work
on confluence and standardization [15] are also in order. The main difference
between their approach and the one we pursue here is that the operator ! in
their calculus $\Lambda^{!}_{\oplus}$ plays both the roles of a marker for
duplicability and of a checkpoint for any probabilistic choice "flowing out" of the
term (i.e. being fired). In our calculus, we do not control duplication, but we
definitely make use of checkpoints. Saying it another way, Faggian and Ronchi Della
Rocca's work is inspired by linear logic, while our approach is inspired by deep
inference, even though this is, on purpose, not evident in the design of our calculus.

Probabilistic λ-calculi can also be seen as vehicles for expressing probabilistic
models in the sense of Bayesian programming [23,3]. This, however, requires an
operator for modeling conditioning, which complicates the metatheory considerably,
and which we do not consider here.

Our permutative reduction is a refinement of that for the call-by-name
probabilistic λ-calculus [20], and is an implementation of the equational theory of
(ordered) binary decision trees via rewriting [27]. Probabilistic decision trees
have been proposed with a primitive binary probabilistic operator [22], but not
with a decomposition as we explore here.


2 The Probabilistic Event λ-Calculus $\Lambda_{PE}$


Definition 1. The probabilistic event λ-calculus ($\Lambda_{PE}$) is given by the
following grammar, with from left to right: a variable (denoted by $x, y, z, \ldots$),
an abstraction, an application, a (labeled) choice, and a (probabilistic) generator.


$$M, N \ ::=\ x \mid \lambda x.\, N \mid N M \mid N \oplus^a M \mid [a].\, N$$


In a term $\lambda x.\, M$ the abstraction $\lambda x$ binds the free occurrences of
the variable $x$ in its scope $M$, and in $[a].\, N$ the generator $[a]$ binds the
label $a$ in $N$. The calculus features a decomposition of the usual probabilistic
sum $\oplus$, as follows.


$$N \oplus M \ \triangleq\ [a].\, N \oplus^a M \qquad (3)$$

The generator $[a]$ represents a probabilistic event, whose outcome, a binary value
in $\{0, 1\}$ represented by the label $a$, is used by the choice operator
$\oplus^a$. That is, $[a]$ flips a coin setting $a$ to 0 (resp. 1), and depending on
this $N \oplus^a M$ reduces to $N$ (resp. $M$). We will use the unlabeled choice
$\oplus$ as in (3). This convention also gives the translation from a call-by-name
probabilistic λ-calculus into $\Lambda_{PE}$ (the interpretation of a call-by-value
probabilistic λ-calculus is in Section 7).


Reduction. Reduction in $\Lambda_{PE}$ will consist of standard β-reduction
$\to_\beta$ plus an evaluation mechanism for generators and choice operators, which
implements probabilistic choice. We will present two such mechanisms: projective
reduction $\to_\pi$ and permutative reduction $\to_p$. While projective reduction
implements the given intuition for the generator and choice operator, we relegate it
to Section 6 and make permutative reduction our main evaluation mechanism, for the
reason that it is more fine-grained, and thus more general.

Permutative reduction is based on the idea that any operator distributes
over the labeled choice operator (see the reduction steps in Figure 1), even other
choice operators, as below.


$$(N \oplus^a M) \oplus^b P \ \sim\ (N \oplus^b P) \oplus^a (M \oplus^b P)$$


To orient this as a rewrite rule, we need to give priority to one label over another.
Fortunately, the relative position of the associated generators $[a]$ and $[b]$
provides just that. Then to define $\to_p$, we will want every choice to belong to
some generator, and make the order of generators explicit.


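Definition 2. The set $\mathrm{fl}(N)$ of free labels of a term $N$ is defined
inductively by:

$\mathrm{fl}(x) = \emptyset$
$\mathrm{fl}(MN) = \mathrm{fl}(M) \cup \mathrm{fl}(N)$
$\mathrm{fl}(\lambda x.\, M) = \mathrm{fl}(M)$
$\mathrm{fl}([a].\, M) = \mathrm{fl}(M) \setminus \{a\}$
$\mathrm{fl}(M \oplus^a N) = \mathrm{fl}(M) \cup \mathrm{fl}(N) \cup \{a\}$

A term $M$ is label-closed if $\mathrm{fl}(M) = \emptyset$.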
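
Definition 2 translates directly into a recursive function. The following is a minimal sketch; the class and function names (`Var`, `Lam`, `App`, `Choice`, `Gen`, `free_labels`, `is_label_closed`) are chosen here for illustration and do not come from the paper.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Var:
    name: str


@dataclass(frozen=True)
class Lam:
    var: str
    body: object


@dataclass(frozen=True)
class App:
    fun: object
    arg: object


@dataclass(frozen=True)
class Choice:  # labeled choice: left (+)^label right
    label: str
    left: object
    right: object


@dataclass(frozen=True)
class Gen:  # generator: [label]. body
    label: str
    body: object


def free_labels(term):
    """The set fl(term) of free labels, following Definition 2 clause by clause."""
    if isinstance(term, Var):
        return frozenset()
    if isinstance(term, Lam):
        return free_labels(term.body)
    if isinstance(term, App):
        return free_labels(term.fun) | free_labels(term.arg)
    if isinstance(term, Choice):
        return free_labels(term.left) | free_labels(term.right) | {term.label}
    if isinstance(term, Gen):
        return free_labels(term.body) - {term.label}
    raise TypeError(f"not a term: {term!r}")


def is_label_closed(term):
    return not free_labels(term)


# x (+)^a y has the free label a; putting it under [a] closes it.
open_choice = Choice("a", Var("x"), Var("y"))
closed_choice = Gen("a", open_choice)
```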
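

$(\lambda x.\, N) M \ \to_\beta\ N[M/x]$   ($\beta$)
$N \oplus^a N \ \to_p\ N$   (i)
$(N \oplus^a M) \oplus^a P \ \to_p\ N \oplus^a P$   (c$_1$)
$N \oplus^a (M \oplus^a P) \ \to_p\ N \oplus^a P$   (c$_2$)
$\lambda x.\, (N \oplus^a M) \ \to_p\ (\lambda x.\, N) \oplus^a (\lambda x.\, M)$   ($\oplus\lambda$)
$(N \oplus^a M) P \ \to_p\ (N P) \oplus^a (M P)$   ($\oplus$f)
$N (M \oplus^a P) \ \to_p\ (N M) \oplus^a (N P)$   ($\oplus$a)
$(N \oplus^a M) \oplus^b P \ \to_p\ (N \oplus^b P) \oplus^a (M \oplus^b P)$   (if $a < b$)   ($\oplus\oplus_1$)
$N \oplus^b (M \oplus^a P) \ \to_p\ (N \oplus^b M) \oplus^a (N \oplus^b P)$   (if $a < b$)   ($\oplus\oplus_2$)
$[b].\, (N \oplus^a M) \ \to_p\ ([b].\, N) \oplus^a ([b].\, M)$   (if $a \neq b$)   ($\oplus\Box$)
$[a].\, N \ \to_p\ N$   (if $a \notin \mathrm{fl}(N)$)   ($\not\Box$)
$\lambda x.\, [a].\, N \ \to_p\ [a].\, \lambda x.\, N$   ($\Box\lambda$)
$([a].\, N) M \ \to_p\ [a].\, (N M)$   (if $a \notin \mathrm{fl}(M)$)   ($\Box$f)


Fig. 1. Reduction Rules for β-reduction and p-reduction.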


From here on, we consider only label-closed terms (we implicitly assume this, 
unless otherwise stated). All terms are identified up to renaming of their bound 
variables and labels. Given some terms M and N and a variable x, M[N/x] is
the capture-avoiding (for both variables and labels) substitution of N for the free 
occurrences of x in M. We speak of a representative M of a term when M is not 
considered up to such a renaming. A representative M of a term is well-labeled 
if for every occurrence of [a] in M there is no [a] occurring in its scope. 


Definition 3 (Order for labels). Let $M$ be a well-labeled representative of a
term. We define an order $<_M$ for the labels occurring in $M$ as follows:
$a <_M b$ if and only if $[b]$ occurs in the scope of $[a]$.


For a well-labeled and label-closed representative $M$, $<_M$ is a finite tree order.


Definition 4. Reduction $\to\ =\ \to_\beta \cup \to_p$ in $\Lambda_{PE}$ consists of
β-reduction $\to_\beta$ and permutative or p-reduction $\to_p$, both defined as the
contextual closure of the rules given in Figure 1. We write $\twoheadrightarrow$ for
the reflexive-transitive closure of $\to$, and $\twoheadrightarrow^{\mathsf{nf}}$ for
reduction to normal form; similarly for $\to_\beta$ and $\to_p$. We write $=_p$ for
the symmetric and reflexive-transitive closure of $\to_p$.
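

$[a].\, (\lambda x.\, x = x)(\top \oplus^a \bot) \ \to_p\ [a].\, (\lambda x.\, x = x)\top \oplus^a (\lambda x.\, x = x)\bot$   ($\oplus$a)
$\twoheadrightarrow_\beta\ [a].\, (\top = \top) \oplus^a (\bot = \bot)$
$\twoheadrightarrow\ [a].\, \top \oplus^a \top \ \to_p\ [a].\, \top \ \to_p\ \top$   (i, $\not\Box$)


Fig. 2. Example Reduction of the cbv-translation of the Term from the Introduction.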


Two example reductions are (1)-(2) in the Introduction; a third, complete reduction
is in Figure 2. The crucial feature of p-reduction is that a choice $\oplus^a$ does
permute out of the argument position of an application, but a generator $[a]$ does
not, as below. Since the argument of a redex may be duplicated, this is how we
characterize the difference between the outcome of a probabilistic event, whose
duplicates may be identified, and the event itself, whose duplicates may yield
different outcomes.


$$N (M \oplus^a P) \ \to_p\ (N M) \oplus^a (N P) \qquad\qquad N ([a].\, M) \ \not\to_p\ [a].\, N M$$
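
The asymmetry above can be seen in a small rewriting sketch. It implements only the rules of Figure 1 that have no side condition on label order, namely (i), (c1), (c2), the choice-over-abstraction rule and the two choice-over-application rules, applied at the root of a term; terms are encoded as tagged tuples. The encoding and all names are assumptions made for illustration.

```python
def var(x):
    return ("var", x)


def lam(x, body):
    return ("lam", x, body)


def app(fun, arg):
    return ("app", fun, arg)


def choice(label, left, right):
    return ("choice", label, left, right)


def gen(label, body):
    return ("gen", label, body)


def step_at_root(term):
    """One p-step at the root, or None if none of the implemented rules applies."""
    tag = term[0]
    if tag == "choice":
        _, a, left, right = term
        if left == right:
            return left                                   # (i)
        if left[0] == "choice" and left[1] == a:
            return choice(a, left[2], right)              # (c1)
        if right[0] == "choice" and right[1] == a:
            return choice(a, left, right[3])              # (c2)
    if tag == "lam" and term[2][0] == "choice":
        x, inner = term[1], term[2]
        return choice(inner[1], lam(x, inner[2]), lam(x, inner[3]))  # choice over lambda
    if tag == "app":
        _, fun, arg = term
        if fun[0] == "choice":
            return choice(fun[1], app(fun[2], arg), app(fun[3], arg))  # function position
        if arg[0] == "choice":
            return choice(arg[1], app(fun, arg[2]), app(fun, arg[3]))  # argument position
    return None


# A choice permutes out of argument position; a generator does not.
permuted = step_at_root(app(var("n"), choice("a", var("m"), var("p"))))
stuck = step_at_root(app(var("n"), gen("a", var("m"))))
```

A full implementation would also need the label order of Definition 3 for the two choice-over-choice rules, and the free-label side conditions for the generator rules.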


By inspection of the rewrite rules in Figure 1, we can then characterize the
normal forms of $\to_p$ and $\to$ as follows.


Proposition 5 (Normal forms). The normal forms $P_0$ of $\to_p$, respectively
$N_0$ of $\to$, are characterized by the following grammars.


$N_0 \ ::=\ N_1 \mid N_0 \oplus N_0$
$N_1 \ ::=\ N_2 \mid \lambda x.\, N_1$
$N_2 \ ::=\ x \mid N_2\, N_0$


3 Properties of Permutative Reduction 


We will prove strong normalization and confluence of $\to_p$. For strong
normalization, the obstacle is the interaction between different choice operators,
which may duplicate each other, creating super-exponential growth.³ Fortunately,
Dershowitz's recursive path orders [12] seem tailor-made for our situation.
Observe that the set $\Lambda_{PE}$ endowed with $\to_p$ is a first-order term
rewriting system over a countably infinite set of variables and the signature
$\Sigma$ given by:

• the binary function symbol $\oplus^a$, for any label $a$;

• the unary function symbol $[a]$, for any label $a$;

• the unary function symbol $\lambda x$, for any variable $x$;

• the binary function symbol @, letting @$(M, N)$ stand for $MN$.


³ This was inferred only from a simple simulation; we would be interested to know a
rigorous complexity result.


Definition 6. Let $M$ be a well-labeled representative of a label-closed term,
and let $\Sigma_M$ be the set of signature symbols occurring in $M$. We define
$\prec_M$ as the (strict) partial order on $\Sigma_M$ generated by the following
rules.


$\oplus^a \prec_M \oplus^b$   if $a <_M b$
$\oplus^a \prec_M [b]$   for any labels $a, b$
$[b] \prec_M$ @, $\lambda x$   for any label $b$

Lemma 7. The reduction +, is strongly normalizing. 


Proof. For the first-order term rewriting system $(\Lambda_{PE}, \to_p)$ we derive a well-founded recursive path ordering $\prec$ from $\prec_M$ following [12, p. 289]. Let $f$ and $g$ range over function symbols, let $[N_1,\dots,N_n]$ denote a multiset and extend $\prec$ to multisets by the standard multiset ordering, and let $N = f(N_1,\dots,N_n)$ and $M = g(M_1,\dots,M_m)$; then

$$N \prec M \iff \begin{cases} [N_1,\dots,N_n] \prec [M_1,\dots,M_m] & \text{if } f = g \\ [N_1,\dots,N_n] \prec [M] & \text{if } f \prec_M g \\ [N] \preceq [M_1,\dots,M_m] & \text{if } f \not\preceq_M g. \end{cases}$$


While $\prec_M$ is defined only relative to $\Sigma_M$, reduction may only reduce the signature. Inspection of Figure 1 then shows that $M \to_p N$ implies $N \prec M$.
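
The multiset extension used in the proof of Lemma 7 can be made concrete. The following Python sketch implements the standard (Dershowitz-Manna) multiset extension of a strict order: a multiset is smaller than another if the two differ and every element it has in excess is dominated by some element the other has in excess. The name `multiset_less` and the encoding of multisets as lists of hashable elements are assumptions made for illustration; the sketch does not implement the full recursive path ordering.

```python
from collections import Counter


def multiset_less(ns, ms, less):
    """Return True iff multiset ns is strictly below multiset ms.

    `less(x, y)` is assumed to be a strict partial order on the elements.
    """
    n, m = Counter(ns), Counter(ms)
    n_only = list((n - m).elements())  # elements ns has in excess of ms
    m_only = list((m - n).elements())  # elements ms has in excess of ns
    if not m_only:
        # Either the multisets are equal, or ms has nothing to dominate
        # the excess elements of ns; ns is not smaller in both cases.
        return False
    # Every excess element of ns must lie below some excess element of ms.
    return all(any(less(x, y) for y in m_only) for x in n_only)
```

For instance, with the usual order on integers, replacing one occurrence of 4 by any number of smaller elements yields a smaller multiset, which is the effect that lets a duplicating rewrite rule still decrease the ordering.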


Confluence of Permutative Reduction. With strong normalization, confluence of $\to_p$ requires only local confluence. We reduce the number of cases to consider, by casting the permutations of $\overset{a}{\oplus}$ as instances of a common shape.


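Definition 8. We define a context $C[\,]$ (with exactly one hole $[\,]$) as follows, and let $C[N]$ represent $C[\,]$ with the hole $[\,]$ replaced by $N$.

$$C[\,] ::= [\,] \mid \lambda x.\,C[\,] \mid C[\,]\,M \mid N\,C[\,] \mid C[\,] \overset{a}{\oplus} M \mid N \overset{a}{\oplus} C[\,] \mid [a].\,C[\,]$$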


Observe that the six reduction rules $\oplus\lambda$ through $\oplus\Box$ in Figure 1 are all of the following form. We refer to these collectively as $\oplus\star$.

$$C[N \overset{a}{\oplus} M] \;\to_p\; C[N] \overset{a}{\oplus} C[M] \qquad (\oplus\star)$$

Lemma 9 (Confluence of $\to_p$). Reduction $\to_p$ is confluent.

Proof. By Newman's lemma and strong normalization of $\to_p$ (Lemma 7), confluence follows from local confluence. The proof of local confluence consists of joining all critical pairs given by $\to_p$. Details are in the Appendix of [7].


Definition 10. We denote the unique p-normal form of a term $N$ by $N_p$.


4 Confluence


We aim to prove that $\to \;=\; \to_\beta \cup \to_p$ is confluent. We will use the standard technique of parallel $\beta$-reduction [26], a simultaneous reduction step on a number of $\beta$-redexes, which we define via a labeling of the redexes to be reduced. The central point is to find a notion of reduction that is diamond, i.e. every critical pair can be closed in one (or zero) steps. This will be our complete reduction, which consists of parallel $\beta$-reduction followed by p-reduction to normal form.


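Definition 11. A labeled term $P^{\bullet}$ is a term $P$ with chosen $\beta$-redexes annotated as $(\lambda x.\,N)^{\bullet} M$. The unique labeled $\beta$-step $P^{\bullet} \Rightarrow_\beta P_{\bullet}$ from $P^{\bullet}$ to the labeled reduct $P_{\bullet}$ reduces every labeled redex, and is defined inductively as follows.

$$\begin{array}{ll}
(\lambda x.\,N^{\bullet})^{\bullet} M^{\bullet} \Rightarrow_\beta N_{\bullet}[M_{\bullet}/x] & N^{\bullet} M^{\bullet} \Rightarrow_\beta N_{\bullet} M_{\bullet} \\
x \Rightarrow_\beta x & N^{\bullet} \overset{a}{\oplus} M^{\bullet} \Rightarrow_\beta N_{\bullet} \overset{a}{\oplus} M_{\bullet} \\
\lambda x.\,N^{\bullet} \Rightarrow_\beta \lambda x.\,N_{\bullet} & [a].\,N^{\bullet} \Rightarrow_\beta [a].\,N_{\bullet}
\end{array}$$

A parallel $\beta$-step $P \Rightarrow_\beta P_{\bullet}$ is a labeled step $P^{\bullet} \Rightarrow_\beta P_{\bullet}$ for some labeling $P^{\bullet}$.

Note that $P_{\bullet}$ is an unlabeled term, since all labels are removed in the reduction. For the empty labeling, $P^{\bullet} = P_{\bullet} = P$, so parallel reduction is reflexive: $P \Rightarrow_\beta P$.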


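Lemma 12. A parallel $\beta$-step $P \Rightarrow_\beta P_{\bullet}$ is a $\beta$-reduction $P \twoheadrightarrow_\beta P_{\bullet}$.

Proof. By induction on the labeled term $P^{\bullet}$ generating $P \Rightarrow_\beta P_{\bullet}$.

Lemma 13. Parallel $\beta$-reduction is diamond.

Proof. Let $P^{\bullet} \Rightarrow_\beta P_{\bullet}$ and $P^{\circ} \Rightarrow_\beta P_{\circ}$ be two labeled reduction steps on a term $P$. We annotate each step with the label of the other, preserved by reduction, to give the span from the doubly labeled term $P^{\bullet\circ} = P^{\circ\bullet}$ below left. Reducing the remaining labels will close the diagram, as below right.

$$P_{\bullet}^{\circ} \;\Leftarrow_\beta\; P^{\bullet\circ} = P^{\circ\bullet} \;\Rightarrow_\beta\; P_{\circ}^{\bullet} \qquad\qquad P_{\bullet}^{\circ} \;\Rightarrow_\beta\; P_{\bullet\circ} = P_{\circ\bullet} \;\Leftarrow_\beta\; P_{\circ}^{\bullet}$$

This is proved by induction on $P^{\bullet\circ}$, where only two cases are not immediate: those where a redex carries one but not the other label. One case follows by the below diagram; the other case is symmetric. Below, for the step top right, induction on $N^{\circ}$ shows that $N^{\circ}[M^{\circ}/x] \Rightarrow_\beta N_{\circ}[M_{\circ}/x]$.

$$\begin{array}{l}
(\lambda x.\,N^{\bullet\circ})^{\bullet} M^{\bullet\circ} \;\Rightarrow_\beta\; N_{\bullet}^{\circ}[M_{\bullet}^{\circ}/x] \;\Rightarrow_\beta\; N_{\bullet\circ}[M_{\bullet\circ}/x] \\
(\lambda x.\,N^{\bullet\circ})^{\bullet} M^{\bullet\circ} \;\Rightarrow_\beta\; (\lambda x.\,N_{\circ}^{\bullet})^{\bullet} M_{\circ}^{\bullet} \;\Rightarrow_\beta\; N_{\circ\bullet}[M_{\circ\bullet}/x]
\end{array}$$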
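4.1 Parallel Reduction and Permutative Reduction


For the commutation of (parallel) $\beta$-reduction with p-reduction, we run into the minor issue that a permuting generator or choice operator may block a redex: in both cases below, before $\to_p$ the term has a redex, but after $\to_p$ it is blocked.

$$(\lambda x.\,N \overset{a}{\oplus} M)\,P \;\to_p\; ((\lambda x.\,N) \overset{a}{\oplus} (\lambda x.\,M))\,P \qquad\qquad (\lambda x.\,[a].\,N)\,M \;\to_p\; ([a].\,\lambda x.\,N)\,M$$

We address this by an adaptation $\to_{p\bullet}$ of p-reduction on labeled terms, which is a strategy in $\twoheadrightarrow_p$ that permutes past a labeled redex in one step.

Definition 14. A labeled p-reduction $N^{\bullet} \to_{p\bullet} M^{\bullet}$ on labeled terms is a p-reduction of one of the forms

$$\begin{array}{l}
(\lambda x.\,N^{\bullet} \overset{a}{\oplus} M^{\bullet})^{\bullet} P^{\bullet} \;\to_{p\bullet}\; (\lambda x.\,N^{\bullet})^{\bullet} P^{\bullet} \overset{a}{\oplus} (\lambda x.\,M^{\bullet})^{\bullet} P^{\bullet} \\
(\lambda x.\,[a].\,N^{\bullet})^{\bullet} M^{\bullet} \;\to_{p\bullet}\; [a].\,(\lambda x.\,N^{\bullet})^{\bullet} M^{\bullet}
\end{array}$$

or a single p-step $\to_p$ on unlabeled constructors in $N^{\bullet}$.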
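Lemma 15. Reduction to normal form in $\to_{p\bullet}$ is equal to reduction to normal form in $\to_p$ (on labeled terms).

Proof. Observe that $\to_p$ and $\to_{p\bullet}$ have the same normal forms. Then in one direction, since $\to_{p\bullet} \subseteq \twoheadrightarrow_p$ we have $\twoheadrightarrow_{p\bullet} \subseteq \twoheadrightarrow_p$. Conversely, let $N \twoheadrightarrow_p M$ with $M$ in normal form. On this reduction, let $P \to_p Q$ be the first step such that $P \not\to_{p\bullet} Q$. Then there is an $R$ such that $P \to_{p\bullet} R$ and $Q \twoheadrightarrow_p R$. Note that we have $N \twoheadrightarrow_{p\bullet} R$. By confluence, $R \twoheadrightarrow_p M$, and by induction on the sum length of paths in $\to_p$ from $R$ (smaller than from $N$) we have $R \twoheadrightarrow_{p\bullet} M$, and hence $N \twoheadrightarrow_{p\bullet} M$.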


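The following lemmata then give the required commutation properties of the relations $\to_{p\bullet}$, $\twoheadrightarrow_p$, and $\Rightarrow_\beta$. Figure 3 illustrates these by commuting diagrams.

Lemma 16. If $N^{\bullet} \to_{p\bullet} M^{\bullet}$ then $N_{\bullet} =_p M_{\bullet}$.

Proof. By induction on the rewrite step $\to_{p\bullet}$. The two interesting cases are:

$$(\lambda x.\,M^{\bullet})^{\bullet} (N^{\bullet} \overset{a}{\oplus} P^{\bullet}) \;\to_{p\bullet}\; ((\lambda x.\,M^{\bullet})^{\bullet} N^{\bullet}) \overset{a}{\oplus} ((\lambda x.\,M^{\bullet})^{\bullet} P^{\bullet})$$

[Diagrams: this step closed by labeled $\beta$-steps, once for the case $x \in \mathsf{fv}(M)$ and once for the case $x \notin \mathsf{fv}(M)$.]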
How the critical pairs in the above diagrams are joined shows that we cannot use the Hindley-Rosen Lemma [2, Prop. 3.3.5] to prove confluence of $\to_\beta \cup \to_p$.

Lemma 17. $N_{\bullet} =_p N_{p\bullet}$.

Proof. Using Lemma 15 we decompose $N^{\bullet} \twoheadrightarrow_p N^{\bullet}_p$ as

$$N^{\bullet} = N^{\bullet}_1 \;\to_{p\bullet}\; N^{\bullet}_2 \;\to_{p\bullet}\; \cdots \;\to_{p\bullet}\; N^{\bullet}_n = N^{\bullet}_p$$

where $(N_i)_{\bullet} =_p (N_{i+1})_{\bullet}$ by Lemma 16.


4.2 Complete Reduction 


To obtain a reduction strategy with the diamond property for $\to$, we combine parallel reduction $\Rightarrow_\beta$ with permutative reduction to normal form $\twoheadrightarrow_p$ into a notion of complete reduction $\Rrightarrow$. We will show that it is diamond (Lemma 19), and that any step in $\to$ maps onto a complete step of p-normal forms (Lemma 20). Confluence of $\to$ (Theorem 21) then follows: any two paths $\twoheadrightarrow$ map onto complete paths $\Rrightarrow^{*}$ on p-normal forms, which then converge by the diamond property.

Definition 18. A complete reduction step $N \Rrightarrow N_{\bullet p}$ is a parallel $\beta$-step followed by p-reduction to normal form:

$$N \Rrightarrow N_{\bullet p} \;:=\; N \Rightarrow_\beta N_{\bullet} \twoheadrightarrow_p N_{\bullet p}\,.$$

Lemma 19 (Complete reduction is diamond). If $P \Lleftarrow N \Rrightarrow M$ then for some $Q$, $P \Rrightarrow Q \Lleftarrow M$.


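Proof. By the following diagram, where $M = N_{\circ p}$ and $P = N_{\bullet p}$, and $Q = N_{\circ\bullet p}$. The square top left is by Lemma 13, top right and bottom left are by Lemma 17, and bottom right is by confluence and strong normalization of p-reduction.

[Diagram: a grid of four squares with corners $N$, $N_{\circ p}$, $N_{\bullet p}$ and $N_{\circ\bullet p}$, built from $\Rightarrow_\beta$ steps and p-reductions to normal form.]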


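Lemma 20 (p-Normalization maps reduction to complete reduction). If $N \to M$ then $N_p \Rrightarrow M_p$.

Proof. For a p-step $N \to_p M$ we have $N_p = M_p$ while $\Rrightarrow$ is reflexive. For a $\beta$-step $N \to_\beta M$ we label the reduced redex in $N$ to get $N^{\bullet} \Rightarrow_\beta N_{\bullet} = M$. Then Lemma 17 gives $N_{p\bullet} =_p M$, and hence $N_p \Rightarrow_\beta N_{p\bullet} \twoheadrightarrow_p M_p$.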
[Four commuting diagrams, labeled Lemma 16, Lemma 17, Lemma 19, and Lemma 20.]


Fig. 3. Diagrams for the Lemmata Leading up to Confluence 


Theorem 21. Reduction $\to$ is confluent.


Proof. By the following diagram. For the top and left areas, by Lemma 20 any reduction path $N \twoheadrightarrow M$ maps onto one $N_p \Rrightarrow^{*} M_p$. The main square follows by the diamond property of complete reduction, Lemma 19.


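[Diagram: reduction paths from $N$ to $M$ and from $N$ to $P$, mapped onto complete reduction paths between the p-normal forms $N_p$, $M_p$ and $P_p$, and closed by a square of complete reductions.]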


5 Strong Normalization for Simply-Typed Terms 


In this section, we prove that the relation $\to$ enjoys strong normalization in simply typed terms. Our proof of strong normalization is based on the classic reducibility technique, and inherently has to deal with label-open terms. It thus makes great sense to turn the order $<_M$ from Definition 3 into something more formal, at the same time allowing terms to be label-open. This is in Figure 4. It is easy to realize that, of course modulo label $\alpha$-equivalence, for every term $M$ there is at least one $\theta$ such that $\theta \vdash_L M$. An easy fact to check is that if $\theta \vdash_L M$ and $M \to N$, then $\theta \vdash_L N$. It thus makes sense to parametrize $\to$ on a sequence of labels $\theta$, i.e., one can define a family of reduction relations $\to^{\theta}$ on pairs in the form $(M, \theta)$. The set of strongly normalizable terms, and the number of steps to normal forms become themselves parametric:
• The set $SN^{\theta}$ of those terms $M$ such that $\theta \vdash_L M$ and $(M,\theta)$ is strongly normalizing modulo $\to^{\theta}$;
• The function $sn^{\theta}$ assigning to any term in $SN^{\theta}$ the maximal number of $\to^{\theta}$ steps to normal form.
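Label Sequences: $\theta ::= \varepsilon \mid a \cdot \theta$
Label Judgments: $\xi ::= \theta \vdash_L M$
Label Rules:
$$\frac{}{\theta \vdash_L x} \qquad \frac{\theta \vdash_L M}{\theta \vdash_L \lambda x.\,M} \qquad \frac{a \cdot \theta \vdash_L M}{\theta \vdash_L [a].\,M}$$
$$\frac{\theta \vdash_L M \quad \theta \vdash_L N}{\theta \vdash_L M N} \qquad \frac{\theta \vdash_L M \quad \theta \vdash_L N \quad a \in \theta}{\theta \vdash_L M \overset{a}{\oplus} N}$$
Fig. 4. Labeling Terms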
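
As a rough illustration of the label judgment $\theta \vdash_L M$, the sketch below checks it on a small term representation: a choice labeled $a$ requires $a \in \theta$, and a generator $[a]$ extends $\theta$ with $a$ for its body. The tuple encoding of terms and the name `well_labeled` are assumptions introduced here for illustration only.

```python
# Hypothetical term encoding (not from the paper):
#   ("var", x)  ("lam", x, N)  ("app", N, M)
#   ("choice", a, N, M)   for N (+)^a M
#   ("gen", a, N)         for [a].N


def well_labeled(theta, t):
    """Return True iff the judgment theta |-_L t is derivable."""
    tag = t[0]
    if tag == "var":
        return True
    if tag == "lam":
        return well_labeled(theta, t[2])
    if tag == "app":
        return well_labeled(theta, t[1]) and well_labeled(theta, t[2])
    if tag == "choice":
        # The label of a choice must be available in theta.
        return (t[1] in theta
                and well_labeled(theta, t[2])
                and well_labeled(theta, t[3]))
    if tag == "gen":
        # A generator [a] makes a available in its body: a . theta.
        return well_labeled((t[1],) + tuple(theta), t[2])
    raise ValueError(f"unknown term tag: {tag!r}")
```

With the empty sequence, a label-closed term such as `("gen", "a", ("choice", "a", x, y))` is accepted, while the bare choice is rejected because its label is not bound.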
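Types: $\tau ::= \alpha \mid \tau \Rightarrow \rho$
Environments: $\Gamma ::= x_1 : \tau_1, \dots, x_n : \tau_n$
Judgments: $\pi ::= \Gamma \vdash M : \tau$
Typing Rules:
$$\frac{}{\Gamma, x : \tau \vdash x : \tau} \qquad \frac{\Gamma, x : \tau \vdash M : \rho}{\Gamma \vdash \lambda x.\,M : \tau \Rightarrow \rho} \qquad \frac{\Gamma \vdash M : \tau}{\Gamma \vdash [a].\,M : \tau}$$
$$\frac{\Gamma \vdash M : \tau \Rightarrow \rho \quad \Gamma \vdash N : \tau}{\Gamma \vdash M N : \rho} \qquad \frac{\Gamma \vdash M : \tau \quad \Gamma \vdash N : \tau}{\Gamma \vdash M \overset{a}{\oplus} N : \tau}$$
Fig. 5. Types, Environments, Judgments, and Rules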
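$$\frac{L_1 \in SN^{\theta} \;\cdots\; L_m \in SN^{\theta}}{x\,L_1 \dots L_m \in SN^{\theta}} \qquad \frac{M L_1 \dots L_m \in SN^{\theta} \quad N L_1 \dots L_m \in SN^{\theta} \quad a \in \theta}{(M \overset{a}{\oplus} N)\,L_1 \dots L_m \in SN^{\theta}}$$
$$\frac{M[L_0/x]\,L_1 \dots L_m \in SN^{\theta} \quad L_0 \in SN^{\theta}}{(\lambda x.\,M)\,L_0 \dots L_m \in SN^{\theta}} \qquad \frac{M L_1 \dots L_m \in SN^{a \cdot \theta} \quad \forall i.\ a \notin L_i}{([a].\,M)\,L_1 \dots L_m \in SN^{\theta}}$$


Fig. 6. Closure Rules for Sets $SN^{\theta}$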


We can now define types, environments, judgments, and typing rules in Figure 5. 


Please notice that the type structure is precisely the one of the usual, vanilla, simply-typed $\lambda$-calculus (although terms are of course different), and we can thus reuse most of the usual proof of strong normalization, for example in the version given by Ralph Loader's notes [21], page 17.


Lemma 22. The closure rules in Figure 6 are all sound. 
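Since the structure of the type system is the one of plain, simple types, the definition of reducibility sets is the classic one:

$$\begin{array}{l}
Red_{\alpha} = \{(\Gamma,\theta,M) \mid M \in SN^{\theta} \wedge \Gamma \vdash M : \alpha\}; \\
Red_{\tau \Rightarrow \rho} = \{(\Gamma,\theta,M) \mid (\Gamma \vdash M : \tau \Rightarrow \rho) \wedge (\theta \vdash_L M) \wedge \forall (\Gamma\Delta,\theta,N) \in Red_{\tau}.\ (\Gamma\Delta,\theta,MN) \in Red_{\rho}\}.
\end{array}$$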


Before proving that all terms are reducible, we need some auxiliary results. 


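Lemma 23. 1. If $(\Gamma,\theta,M) \in Red_{\tau}$, then $M \in SN^{\theta}$.
2. If $\Gamma \vdash x L_1 \dots L_m : \tau$ and $L_1,\dots,L_m \in SN^{\theta}$, then $(\Gamma,\theta,x L_1 \dots L_m) \in Red_{\tau}$.
3. If $(\Gamma,\theta,M[L_0/x] L_1 \dots L_m) \in Red_{\tau}$ with $\Gamma \vdash L_0 : \rho$ and $L_0 \in SN^{\theta}$, then $(\Gamma,\theta,(\lambda x.\,M) L_0 \dots L_m) \in Red_{\tau}$.
4. If $(\Gamma,\theta,M L_1 \dots L_m) \in Red_{\tau}$ with $(\Gamma,\theta,N L_1 \dots L_m) \in Red_{\tau}$ and $a \in \theta$, then $(\Gamma,\theta,(M \overset{a}{\oplus} N) L_1 \dots L_m) \in Red_{\tau}$.
5. If $(\Gamma,a \cdot \theta,M L_1 \dots L_m) \in Red_{\tau}$ and $a \notin L_i$ for all $i$, then $(\Gamma,\theta,([a].\,M) L_1 \dots L_m) \in Red_{\tau}$.


Proof. The proof is an induction on $\tau$: If $\tau$ is an atom $\alpha$, then Point 1 follows by definition, while Points 2 to 5 come from Lemma 22. If $\tau$ is $\rho \Rightarrow \mu$, Points 2 to 5 come directly from the induction hypothesis, while Point 1 can be proved by observing that $M$ is in $SN^{\theta}$ if $Mx$ is itself in $SN^{\theta}$, where $x$ is a fresh variable. By induction hypothesis (on Point 2), we can say that $(\Gamma(x : \rho),\theta,x) \in Red_{\rho}$, and conclude that $(\Gamma(x : \rho),\theta,Mx) \in Red_{\mu}$.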


The following is the so-called Main Lemma: 


Proposition 24. Suppose $y_1 : \tau_1, \dots, y_n : \tau_n \vdash M : \rho$ and $\theta \vdash_L M$, with $(\Gamma,\theta,N_j) \in Red_{\tau_j}$ for all $1 \le j \le n$. Then $(\Gamma,\theta,M[N_1/y_1,\dots,N_n/y_n]) \in Red_{\rho}$.


Proof. This is an induction on the structure of the term $M$:
• If $M$ is a variable, necessarily one among $y_1,\dots,y_n$, then the result is trivial.
• If $M$ is an application $LP$, then there exists a type $\xi$ such that $y_1 : \tau_1,\dots,y_n : \tau_n \vdash L : \xi \Rightarrow \rho$ and $y_1 : \tau_1,\dots,y_n : \tau_n \vdash P : \xi$. Moreover, $\theta \vdash_L L$ and $\theta \vdash_L P$; we can then safely apply the induction hypothesis and conclude that

$$(\Gamma,\theta,L[\overline{N}/\overline{y}]) \in Red_{\xi \Rightarrow \rho} \qquad (\Gamma,\theta,P[\overline{N}/\overline{y}]) \in Red_{\xi}\,.$$

By definition, we get

$$(\Gamma,\theta,(LP)[\overline{N}/\overline{y}]) \in Red_{\rho}\,.$$

• If $M$ is an abstraction $\lambda x.\,L$, then $\rho$ is an arrow type $\xi \Rightarrow \mu$ and $y_1 : \tau_1,\dots,y_n : \tau_n, x : \xi \vdash L : \mu$. Now, consider any $(\Gamma\Delta,\theta,P) \in Red_{\xi}$. Our objective is to prove with this hypothesis that $(\Gamma\Delta,\theta,(\lambda x.\,L[\overline{N}/\overline{y}])P) \in Red_{\mu}$. By induction hypothesis, since $(\Gamma\Delta,\theta,N_i) \in Red_{\tau_i}$, we get that $(\Gamma\Delta,\theta,L[\overline{N}/\overline{y},P/x]) \in Red_{\mu}$. The thesis follows from Lemma 23.
• If $M$ is a sum $L \overset{a}{\oplus} P$, we can make use of Lemma 23 and the induction hypothesis, and conclude.

• If $M$ is a generator $[a].\,P$, we can make use of Lemma 23 and the induction hypothesis. We should however observe that $a \cdot \theta \vdash_L P$, since $\theta \vdash_L M$.


We now have all the ingredients for our proof of strong normalization: 
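Theorem 25. If $\Gamma \vdash M : \tau$ and $\theta \vdash_L M$, then $M \in SN^{\theta}$.


Proof. Suppose that $x_1 : \rho_1, \dots, x_n : \rho_n \vdash M : \tau$. Since $x_1 : \rho_1,\dots,x_n : \rho_n \vdash x_i : \rho_i$ for all $i$, and clearly $\theta \vdash_L x_i$ for every $i$, we can apply Proposition 24 and obtain that $(\Gamma,\theta,M[\overline{x}/\overline{x}]) \in Red_{\tau}$ from which, via Lemma 23, one gets the thesis.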


6 Projective Reduction 


Permutative reduction $\to_p$ evaluates probabilistic sums purely by rewriting. Here we look at a more standard projective notion of reduction, which conforms more closely to the intuition that $[a]$ generates a probabilistic event to determine the choice $\overset{a}{\oplus}$. Using $+$ for an external probabilistic sum, we expect to reduce $[a].\,N$ to $N_0 + N_1$ where each $N_i$ is obtained from $N$ by projecting every subterm $M_0 \overset{a}{\oplus} M_1$ to $M_i$. The question is, in what context should we admit this reduction? We first limit ourselves to reducing in head position.


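Definition 26. The $a$-projections $\pi^a_0(N)$ and $\pi^a_1(N)$ are defined as follows:

$$\begin{array}{ll}
\pi^a_0(N \overset{a}{\oplus} M) = \pi^a_0(N) & \pi^a_i(\lambda x.\,N) = \lambda x.\,\pi^a_i(N) \\
\pi^a_1(N \overset{a}{\oplus} M) = \pi^a_1(M) & \pi^a_i(N M) = \pi^a_i(N)\,\pi^a_i(M) \\
\pi^a_i([a].\,N) = [a].\,N & \pi^a_i(N \overset{b}{\oplus} M) = \pi^a_i(N) \overset{b}{\oplus} \pi^a_i(M) \quad \text{if } a \neq b \\
\pi^a_i(x) = x & \pi^a_i([b].\,N) = [b].\,\pi^a_i(N) \quad \text{if } a \neq b.
\end{array}$$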
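
The projections of Definition 26 translate directly into a recursive function. The sketch below is illustrative only: the tuple encoding of terms and the name `project` are assumptions, and the clauses follow the reading that a choice on the projected label selects its left (for 0) or right (for 1) branch, while an inner generator on the same label stops the projection.

```python
# Hypothetical term encoding (not from the paper):
#   ("var", x)  ("lam", x, N)  ("app", N, M)
#   ("choice", a, N, M)   for N (+)^a M
#   ("gen", a, N)         for [a].N


def project(a, i, t):
    """Return the a-projection of t for outcome i in {0, 1}."""
    tag = t[0]
    if tag == "var":
        return t
    if tag == "lam":
        return ("lam", t[1], project(a, i, t[2]))
    if tag == "app":
        return ("app", project(a, i, t[1]), project(a, i, t[2]))
    if tag == "choice":
        _, b, left, right = t
        if b == a:
            # Select one branch and keep projecting inside it.
            return project(a, i, left if i == 0 else right)
        return ("choice", b, project(a, i, left), project(a, i, right))
    if tag == "gen":
        if t[1] == a:
            return t  # an inner [a] rebinds the label: leave untouched
        return ("gen", t[1], project(a, i, t[2]))
    raise ValueError(f"unknown term tag: {tag!r}")
```

Projecting removes every choice on the given label that is not shielded by an inner generator, so the result contains no free occurrence of that label.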


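Definition 27. A head context $H[\,]$ is given by the following grammar.

$$H[\,] ::= [\,] \mid \lambda x.\,H[\,] \mid H[\,]\,N$$

Definition 28. Projective head reduction $\to_{\pi h}$ is given by

$$H[[a].\,N] \;\to_{\pi h}\; H[\pi^a_0(N)] + H[\pi^a_1(N)].$$


We can simulate $\to_{\pi h}$ by permutative reduction if we interpret the external sum $+$ by an outermost $\overset{a}{\oplus}$ (taking special care if the label does not occur).

Proposition 29. Permutative reduction simulates projective head reduction:

$$H[[a].\,N] \;\twoheadrightarrow_p\; \begin{cases} H[N] & \text{if } a \notin \mathsf{fl}(N) \\ [a].\,H[\pi^a_0(N)] \overset{a}{\oplus} H[\pi^a_1(N)] & \text{otherwise.} \end{cases}$$

Proof. The case $a \notin \mathsf{fl}(N)$ is immediate by a step of the rule $[a].\,N \to_p N$. For the other case, observe that $H[[a].\,N] \twoheadrightarrow_p [a].\,H[N]$ by $\Box\lambda$ and $\Box f$ steps, and since $a$ does not occur in $H[\,]$, that $H[\pi^a_i(N)] = \pi^a_i(H[N])$. By induction on $N$, if $a$ is minimal in $N$ (i.e. $a \in \mathsf{fl}(N)$ and $a \le b$ for all $b \in \mathsf{fl}(N)$) then $N \twoheadrightarrow_p \pi^a_0(N) \overset{a}{\oplus} \pi^a_1(N)$. As required,

$$H[[a].\,N] \;\twoheadrightarrow_p\; [a].\,H[\pi^a_0(N)] \overset{a}{\oplus} H[\pi^a_1(N)] \quad \text{if } a \in \mathsf{fl}(N).$$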


A gap remains between which generators will not be duplicated, which we 
should be able to reduce, and which generators projective head reduction does 
reduce. In particular, to interpret call-by-value probabilistic reduction in Sec- 
tion 7, we would like to reduce under other generators. However, permutative 
reduction does not permit exchanging generators, and so only simulates reducing 
in head position. While (independent) probabilistic events are generally consid- 
ered interchangeable, it is a question whether the below equivalence is desirable. 


$$[a].\,[b].\,N \;\sim\; [b].\,[a].\,N \qquad (4)$$


We elide the issue by externalizing probabilistic events, and reducing with reference to a predetermined binary stream $s \in \{0,1\}^{\mathbb{N}}$ representing their outcomes.
In this way, we will preserve the intuitions of both permutative and projective 
reduction: we obtain a qualified version of the equivalence (4) (see (5) below), 
and will be able to reduce any generator on the spine of a term: under (other) 
generators and choices as well as under abstractions and in function position. 


Definition 30. The set of streams is $S = \{0,1\}^{\mathbb{N}}$, ranged over by $r, s, t$, and $i \cdot s$ denotes a stream with $i \in \{0,1\}$ as first element and $s$ as the remainder.


Definition 31. The stream labeling $N^s$ of a term $N$ with a stream $s \in S$, which annotates generators as $[a]^i$ with $i \in \{0,1\}$ and variables as $x^s$ with a stream $s$, is given inductively below. We lift $\beta$-reduction to stream-labeled terms by introducing a substitution case for stream-labeled variables: $x^s[M/x] = M^s$.

$$\begin{array}{ll}
(\lambda x.\,N)^s = \lambda x.\,N^s & ([a].\,N)^{i \cdot s} = [a]^i.\,N^s \\
(N M)^s = N^s M & (N \overset{a}{\oplus} M)^s = N^s \overset{a}{\oplus} M^s
\end{array}$$


Definition 32. Projective reduction $\to_\pi$ on stream-labeled terms is the rewrite relation given by

$$[a]^i.\,N \;\to_\pi\; \pi^a_i(N).$$
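
To make the stream labeling of Definition 31 more tangible, here is a small sketch that labels a term with a finite prefix of a stream. Everything about the encoding is an assumption made for illustration: terms are tuples, a labeled generator $[a]^i$ is written `("lgen", a, i, body)`, and a labeled variable $x^s$ is written `("var", x, s)` with `s` the unused rest of the prefix. Generators reached after the prefix is exhausted, and all generators in argument position, stay unlabeled.

```python
# Hypothetical term encoding (not from the paper):
#   ("var", x)  ("lam", x, N)  ("app", N, M)
#   ("choice", a, N, M)   for N (+)^a M
#   ("gen", a, N)         for [a].N


def label(t, bits):
    """Label the spine of t with the finite stream prefix `bits` (a tuple of 0/1)."""
    tag = t[0]
    if tag == "var":
        return ("var", t[1], tuple(bits))        # x^s keeps the remaining stream
    if tag == "lam":
        return ("lam", t[1], label(t[2], bits))
    if tag == "app":
        # Only the function position is on the spine; the argument is untouched.
        return ("app", label(t[1], bits), t[2])
    if tag == "choice":
        # Both branches receive the same stream.
        return ("choice", t[1], label(t[2], bits), label(t[3], bits))
    if tag == "gen":
        if not bits:
            return t                             # prefix exhausted: stays unlabeled
        return ("lgen", t[1], bits[0], label(t[2], bits[1:]))
    raise ValueError(f"unknown term tag: {tag!r}")
```

A generator under $n$ other spine generators receives the element at position $n + 1$ of the prefix, which is the behaviour the surrounding text describes.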


Observe that in $N^s$ a generator that occurs under $n$ other generators on the spine of $N$ is labeled with the element of $s$ at position $n + 1$. Generators in argument position remain unlabeled, until a $\beta$-step places them on the spine, in which case they become labeled by the new substitution case. We allow annotating a term with a finite prefix of a stream, e.g. $N^i$ with a singleton $i$, so that only part of the spine is labeled. Subsequent labeling of a partly labeled term is then by $(N^r)^s = N^{r \cdot s}$ (abusing notation). To introduce streams via the external probabilistic sum, and to ignore an unused remaining stream after completing a probabilistic computation, we adopt the following equation.

$$N = N^0 + N^1$$

Proposition 33. Projective reduction generalizes projective head reduction:

$$H[[a].\,N] \;=\; H[[a]^0.\,N] + H[[a]^1.\,N] \;\twoheadrightarrow_\pi\; H[\pi^a_0(N)] + H[\pi^a_1(N)].$$


Returning to the interchangeability of probabilistic events, we refine (4) by exchanging the corresponding elements of the annotating streams:

$$([a].\,[b].\,N)^{i \cdot j \cdot s} \;=\; [a]^i.\,[b]^j.\,N^s \;\sim\; [b]^j.\,[a]^i.\,N^s \;=\; ([b].\,[a].\,N)^{j \cdot i \cdot s} \qquad (5)$$

Stream-labeling externalizes all probabilities, making reduction determinis- 
tic. This is expressed by the following proposition, that stream-labeling com- 
mutes with reduction: if a generator remains unlabeled in M and becomes la- 
beled after a reduction step M — N, what label it receives is predetermined. 
The deep reason is that stream labeling assigns an outcome to each generator in 
a way that corresponds to a call-by-name strategy for probabilistic reduction. 


Proposition 34. If $M \to N$ by a step other than the p rule that removes an unused generator, then $M^s \to N^s$.


Remark 35. The statement is false for the p rule $[a].\,N \to_p N$ ($a \notin \mathsf{fl}(N)$), as it removes a generator but not an element from the stream. Arguably, for this reason the rule should be excluded from the calculus. On the other hand, the rule is necessary to implement idempotence of $\oplus$, rather than just $\overset{a}{\oplus}$, as follows.

$$N \oplus N \;=\; [a].\,N \overset{a}{\oplus} N \;\to_p\; [a].\,N \;\to_p\; N \qquad \text{where } a \notin \mathsf{fl}(N)$$


The below proposition then expresses that projective reduction is an invariant for permutative reduction. If $N \to_p M$ by a step (other than the rule of Remark 35) on a labeled generator $[a]^i$ or a corresponding choice $\overset{a}{\oplus}$, then $N$ and $M$ reduce to a common term, $N \twoheadrightarrow_\pi P \twoheadleftarrow_\pi M$, by the projective steps evaluating $[a]^i$.


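Proposition 36. Projective reduction is an invariant for permutative reduction, as follows (with a case for $c_2$ symmetric to $c_1$, and where $D[\,]$ is a context).

[Diagrams, partly recoverable. Each shows a p-step on a stream-labeled term together with the projective steps $\to_\pi$ that join both sides:
• $[a]^i.\,C[N \overset{a}{\oplus} N] \to_p [a]^i.\,C[N]$, both sides reducing to $\pi^a_i(C[N])$;
• $[a]^i.\,C[(N_0 \overset{a}{\oplus} M) \overset{a}{\oplus} N_1] \to_p [a]^i.\,C[N_0 \overset{a}{\oplus} N_1]$, both sides reducing to $\pi^a_i(C[N_i])$;
• $\lambda x.\,[a]^i.\,N \to_p [a]^i.\,\lambda x.\,N$, with $\lambda x.\,\pi^a_i(N) = \pi^a_i(\lambda x.\,N)$;
• $([a]^i.\,N)\,M \to_p [a]^i.\,N M$, with $\pi^a_i(N)\,M = \pi^a_i(N M)$.]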


7 Call-by-value Interpretation 


We consider the interpretation of a call-by-value probabilistic $\lambda$-calculus. For simplicity we will allow duplicating (or deleting) $\beta$-redexes, and only restrict duplicating probabilities; our values $V$ are then just deterministic (i.e. without choices) terms, possibly applications and not necessarily $\beta$-normal (so that our $\to_{\beta v}$ is actually $\beta$-reduction on deterministic terms, unlike [9]). We evaluate the internal probabilistic choice $\oplus_v$ to an external probabilistic choice $+$.

$$\begin{array}{ll}
N ::= x \mid \lambda x.\,N \mid M N \mid M \oplus_v N & (\lambda x.\,N)\,V \;\to_{\beta v}\; N[V/x] \\
V, W ::= x \mid \lambda x.\,V \mid V W & M \oplus_v N \;\to_v\; M + N
\end{array}$$


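The interpretation $[\![N]\!]_v$ of a call-by-value term $N$ into $\Lambda_{PE}$ is given as follows. First, we translate $N$ to a label-open term $[\![N]\!]_{open} = \theta \vdash_L P$ by replacing each choice $\oplus_v$ with one $\overset{a}{\oplus}$ with a unique label, where the label-context $\theta$ collects the labels used. Then $[\![N]\!]_v$ is the label closure $[\![N]\!]_v = \lfloor \theta \vdash_L P \rfloor$, which prefixes $P$ with a generator $[a]$ for every $a$ in $\theta$.


Definition 37. (Call-by-value interpretation) The open interpretation $[\![N]\!]_{open}$ of a call-by-value term $N$ is as follows, where all labels are fresh, and inductively $[\![N_i]\!]_{open} = \theta_i \vdash_L P_i$ for $i \in \{1,2\}$.

$$\begin{array}{ll}
[\![x]\!]_{open} = \;\vdash_L x & [\![N_1 N_2]\!]_{open} = \theta_2 \cdot \theta_1 \vdash_L P_1 P_2 \\
[\![\lambda x.\,N_1]\!]_{open} = \theta_1 \vdash_L \lambda x.\,P_1 & [\![N_1 \oplus_v N_2]\!]_{open} = \theta_2 \cdot \theta_1 \cdot a \vdash_L P_1 \overset{a}{\oplus} P_2
\end{array}$$

The label closure $\lfloor \theta \vdash_L P \rfloor$ is given inductively as follows.

$$\lfloor\, \vdash_L P \rfloor = P \qquad\qquad \lfloor a \cdot \theta \vdash_L P \rfloor = \lfloor \theta \vdash_L [a].\,P \rfloor$$

The call-by-value interpretation of $N$ is $[\![N]\!]_v = \lfloor [\![N]\!]_{open} \rfloor$.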
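
The two stages of Definition 37 can be sketched as two small functions. This is an illustration under stated assumptions, not the authors' implementation: call-by-value terms and target terms are encoded as tuples, fresh labels are drawn from a counter as `a0`, `a1`, ..., and the label closure is read as wrapping the head of $\theta$ innermost, following the clause $\lfloor a \cdot \theta \vdash_L P \rfloor = \lfloor \theta \vdash_L [a].\,P \rfloor$.

```python
import itertools

# Hypothetical encodings (not from the paper):
#   source: ("var", x)  ("lam", x, N)  ("app", M, N)  ("vchoice", M, N)
#   target: ("var", x)  ("lam", x, N)  ("app", M, N)
#           ("choice", a, M, N)  ("gen", a, N)


def open_interp(t, fresh):
    """Return (theta, P): the label sequence and the label-open translation of t."""
    tag = t[0]
    if tag == "var":
        return [], t
    if tag == "lam":
        theta, p = open_interp(t[2], fresh)
        return theta, ("lam", t[1], p)
    if tag == "app":
        th1, p1 = open_interp(t[1], fresh)
        th2, p2 = open_interp(t[2], fresh)
        return th2 + th1, ("app", p1, p2)
    if tag == "vchoice":
        th1, p1 = open_interp(t[1], fresh)
        th2, p2 = open_interp(t[2], fresh)
        a = next(fresh)                      # a fresh label for this choice
        return th2 + th1 + [a], ("choice", a, p1, p2)
    raise ValueError(f"unknown term tag: {tag!r}")


def label_closure(theta, p):
    """Prefix p with one generator per label; the head of theta ends up innermost."""
    for a in theta:
        p = ("gen", a, p)
    return p


def cbv_interp(t):
    """Call-by-value interpretation: open interpretation followed by label closure."""
    fresh = (f"a{n}" for n in itertools.count())
    return label_closure(*open_interp(t, fresh))
```

Under this reading the label of an outer choice is appended last to $\theta$, so its generator becomes the outermost one in the closed term.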


Our call-by-value reduction may choose an arbitrary order in which to evaluate the choices $\oplus_v$ in a term $N$, but the order of generators in the interpretation $[\![N]\!]_v$ is necessarily fixed. Then to simulate a call-by-value reduction, we cannot choose a fixed context stream a priori; all we can say is that for every reduction, there is some stream that allows us to simulate it. Specifically, a reduction step $C[N_0 \oplus_v N_1] \to_v C[N_j]$ where $C[\,]$ is a call-by-value term context is simulated by the following projective step.

$$\dots [a]^i.\,[b]^j.\,[c]^k \dots D[P_0 \overset{b}{\oplus} P_1] \;\to_\pi\; \dots [a]^i.\,[c]^k \dots D[P_j]$$

Here, $[\![C[N_0 \oplus_v N_1]]\!]_{open} = \theta \vdash_L D[P_0 \overset{b}{\oplus} P_1]$ with $D[\,]$ a $\Lambda_{PE}$-context, and $\theta$ giving rise to the sequence of generators $\dots [a].\,[b].\,[c] \dots$ in the call-by-value translation. To simulate the reduction step, if $b$ occupies the $n$-th position in $\theta$, then the $n$-th position in the context stream $s$ must be the element $j$. Since $\beta$-reduction survives the translation and labeling process intact, we may simulate call-by-value probabilistic reduction by projective and $\beta$-reduction.


Theorem 38. If $N \twoheadrightarrow_{v,\beta v} V$ then $[\![N]\!]_v^{\,s} \twoheadrightarrow_{\pi,\beta} [\![V]\!]_v$ for some stream $s \in S$.


8 Conclusions and Future Work 


We believe our decomposition of probabilistic choice in $\lambda$-calculus to be an elegant and compelling way of restoring confluence, one of the core properties of the $\lambda$-calculus. Our probabilistic event $\lambda$-calculus captures traditional call-by-name
and call-by-value probabilistic reduction, and offers finer control beyond those
strategies. Permutative reduction implements a natural and fine-grained equiv- 
alence on probabilistic terms as internal rewriting, while projective reduction 
provides a complementary and more traditional external perspective. 

There are a few immediate areas for future work. Firstly, within probabilistic
$\lambda$-calculus, it is worth exploring if our decomposition opens up new avenues in
semantics. Secondly, our approach might apply to probabilistic reasoning more
widely, outside the $\lambda$-calculus. Most importantly, we will explore if our approach
can be extended to other computational effects. Our use of streams interprets 
probabilistic choice as a read operation from an external source, which means 
other read operations can be treated similarly. A complementary treatment of 
write operations would allow us to express a considerable range of effects, in- 
cluding input/output and state. 
Abstract. We study k-synchronizability: a system is k-synchronizable
if any of its executions, up to reordering causally independent actions, 
can be divided into a succession of k-bounded interaction phases. We 
show two results (both for mailbox and peer-to-peer automata): first, the 
reachability problem is decidable for k-synchronizable systems; second, 
the membership problem (whether a given system is k-synchronizable) 
is decidable as well. Our proofs fix several important issues in previous 
attempts to prove these two results for mailbox automata. 


Keywords: Verification - Communicating Automata - A/Synchronous 
communication. 


1 Introduction 


Asynchronous message-passing is ubiquitous in communication-centric systems; 
these include high-performance computing, distributed memory management, 
event-driven programming, or web services orchestration. One of the parameters 
that play an important role in these systems is whether the number of pending 
sent messages can be bounded in a predictable fashion, or whether the buffering 
capacity offered by the communication layer should be unlimited. Clearly, when 
considering implementation, testing, or verification, bounded asynchrony is pre- 
ferred over unbounded asynchrony. Indeed, for bounded systems, reachability 
analysis and invariants inference can be solved by regular model-checking [5]. 
Unfortunately, although designing a new system in this setting is easier, this is no longer the case when the buffering capacity is unbounded, or when the bound is not known a priori. Thus, a question that arises naturally is how we can bound the “behaviour” of a system so that it operates as one with unbounded buffers. In a recent work [4], Bouajjani et al. introduced the notion of
k-synchronizable system of finite state machines communicating through mail- 
boxes and showed that the reachability problem is decidable for such systems. 
Intuitively, a system is k-synchronizable if any of its executions, up to reordering 
causally independent actions, can be chopped into a succession of k-bounded in- 
teraction phases. Each of these phases starts with at most k send actions that are 
followed by at most k receptions. Notice that, a system may be k-synchronizable 
even if some of its executions require buffers of unbounded capacity. 

As explained in the present paper, this result, although valid, is surprisingly 
non-trivial, mostly due to complications introduced by the mailbox semantics of 
© The Author(s) 2020 
communications. Some of these complications were missed by Bouajjani et al.
and the algorithm for the reachability problem in [4] suffers from false positives. 
Another problem is the membership problem for the subclass of k-synchronizable 
systems: for a given k and a given system of communicating finite state machines, 
is this system k-synchronizable? The main result in [4] is that this problem is 
decidable. However, again, the proof of this result contains an important flaw at 
the very first step that breaks all subsequent developments; as a consequence, 
the algorithm given in [4] produces both false positives and false negatives. 

In this work, we present a new proof of the decidability of the reachability 
problem together with a new proof of the decidability of the membership pro- 
blem. Quite surprisingly, the reachability problem is more demanding in terms of 
causality analysis, whereas the membership problem, although rather intricate, 
builds on a simpler dependency analysis. We also extend both decidability results 
to the case of peer-to-peer communication. 


Outline. Next section recalls the definition of communicating systems and re- 
lated notions. In Section 3 we introduce k-synchronizability and we give a graphi- 
cal characterisation of this property. This characterisation corrects Theorem 1 
in [4] and highlights the flaw in the proof of the membership problem. Next, 
in Section 4, we establish the decidability of the reachability problem, which is 
the core of our contribution and departs considerably from [4]. In Section 5, we 
show the decidability of the membership problem. Section 6 extends previous 
results to the peer-to-peer setting. Finally Section 7 concludes the paper dis- 
cussing other related works. Proofs and some additional material are available 
at https://hal.archives-ouvertes.fr/hal-02272347. 


2 Preliminaries 


A communicating system is a set of finite state machines that exchange messages: 
automata have transitions labelled with either send or receive actions. The paper mainly considers mailboxes as the communication architecture: messages wait to be received in FIFO buffers that store all messages sent to the same automaton, regardless of their senders. Section 6 instead treats peer-to-peer systems, whose introduction is therefore delayed to that point.

Let $\mathbb{V}$ be a finite set of messages and $\mathbb{P}$ a finite set of processes. A send action, denoted $send(p,q,v)$, designates the sending of message $v$ from process $p$ to process $q$. Similarly a receive action $rec(p,q,v)$ expresses that process $q$ is receiving message $v$ from $p$. We write $a$ to denote a send or receive action. Let $S = \{send(p,q,v) \mid p,q \in \mathbb{P}, v \in \mathbb{V}\}$ be the set of send actions and $R = \{rec(p,q,v) \mid p,q \in \mathbb{P}, v \in \mathbb{V}\}$ the set of receive actions. $S_p$ and $R_p$ stand for the set of sends and receives of process $p$ respectively. Each process is encoded by an automaton and by abuse of notation we say that a system is the parallel composition of processes.


Definition 1 (System). A system is a tuple $\mathfrak{S} = ((L_p, \delta_p, l^0_p) \mid p \in \mathbb{P})$ where, for each process $p$, $L_p$ is a finite set of local control states, $\delta_p \subseteq (L_p \times (S_p \cup R_p) \times L_p)$ is the transition relation (also denoted $l \xrightarrow{a}_p l'$) and $l^0_p$ is the initial state.
Definition 2 (Configuration). Let $\mathfrak{S} = ((L_p, \delta_p, l^0_p) \mid p \in \mathbb{P})$, a configuration is a pair $(\vec{l}, Buf)$ where $\vec{l} = (l_p)_{p \in \mathbb{P}} \in \Pi_{p \in \mathbb{P}} L_p$ is a global control state of $\mathfrak{S}$ (a local control state for each automaton), and $Buf = (b_p)_{p \in \mathbb{P}} \in (\mathbb{V}^*)^{\mathbb{P}}$ is a vector of buffers, each $b_p$ being a word over $\mathbb{V}$.


We write $\vec{l_0}$ to denote the vector of initial states of all processes $p \in \mathbb{P}$, and $Buf_0$ stands for the vector of empty buffers. The semantics of a system is defined by the two rules below.


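$$\text{[SEND]}\quad \frac{l_p \xrightarrow{send(p,q,v)}_p l'_p \qquad b'_q = b_q \cdot v}{(\vec{l}, Buf) \xrightarrow{send(p,q,v)} (\vec{l}[l'_p/l_p],\ Buf[b'_q/b_q])} \qquad\qquad \text{[RECEIVE]}\quad \frac{l_q \xrightarrow{rec(p,q,v)}_q l'_q \qquad b_q = v \cdot b'_q}{(\vec{l}, Buf) \xrightarrow{rec(p,q,v)} (\vec{l}[l'_q/l_q],\ Buf[b'_q/b_q])}$$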
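
The buffer part of the two rules can be replayed mechanically. The sketch below is a simplification made for illustration: it ignores the control states (as if every automaton allowed every action), encodes actions as tuples `(kind, p, q, v)`, and uses the hypothetical name `run`. A send appends to the receiver's mailbox; a receive is enabled only if the message is at the head of that mailbox, whoever sent it.

```python
def run(actions, processes):
    """Replay `actions` on empty mailboxes.

    Returns the final buffers, or None if some receive is not enabled.
    Control states are deliberately ignored in this sketch.
    """
    buffers = {p: [] for p in processes}
    for kind, p, q, v in actions:
        if kind == "send":
            buffers[q].append(v)               # [SEND]: b'_q = b_q . v
        elif kind == "rec":
            if not buffers[q] or buffers[q][0] != v:
                return None                    # [RECEIVE] needs b_q = v . b'_q
            buffers[q].pop(0)
        else:
            raise ValueError(f"unknown action kind: {kind!r}")
    return buffers
```

Because a mailbox is a single FIFO word over messages, a message sent later by another process cannot overtake an earlier one addressed to the same receiver.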


A send action adds a message in the buffer $b_q$ of the receiver, and a receive action pops the message from this buffer. An execution $e = a_1 \cdots a_n$ is a sequence of actions in $S \cup R$ such that $(\vec{l_0}, Buf_0) \xrightarrow{a_1} \cdots \xrightarrow{a_n} (\vec{l}, Buf)$ for some $\vec{l}$ and $Buf$. As usual $\xRightarrow{e}$ stands for $\xrightarrow{a_1} \cdots \xrightarrow{a_n}$. We write $asEx(\mathfrak{S})$ to denote the set of asynchronous executions of a system $\mathfrak{S}$. In a sequence of actions $e = a_1 \cdots a_n$, a send action $a_i = send(p,q,v)$ is matched by a reception $a_j = rec(p',q',v')$ (denoted by $a_i \mathrel{\vdash\!\dashv} a_j$) if $i < j$, $p = p'$, $q = q'$, $v = v'$, and there is $\ell \ge 1$ such that $a_i$ and $a_j$ are the $\ell$th actions of $e$ with these properties respectively. A send action $a_i$ is unmatched if there is no matching reception in $e$. A message exchange of a sequence of actions $e$ is a set either of the form $\mathsf{v} = \{a_i, a_j\}$ with $a_i \mathrel{\vdash\!\dashv} a_j$ or of the form $\mathsf{v} = \{a_i\}$ with $a_i$ unmatched. For a message $v_i$, we will note $\mathsf{v}_i$ the corresponding message exchange. When $\mathsf{v}$ is either an unmatched $send(p,q,v)$ or a pair of matched actions $\{send(p,q,v), rec(p,q,v)\}$, we write $proc_S(\mathsf{v})$ for $p$ and $proc_R(\mathsf{v})$ for $q$. Note that $proc_R(\mathsf{v})$ is defined even if $\mathsf{v}$ is unmatched. Finally, we write $procs(\mathsf{v})$ for $\{p\}$ in the case of an unmatched send and $\{p,q\}$ in the case of a matched send.
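
The matching relation between sends and receptions can be computed by pairing, for each triple $(p, q, v)$, the $\ell$th send with the $\ell$th reception, provided the send comes first. The following sketch assumes actions encoded as tuples `(kind, p, q, v)` and uses 0-based positions; the name `match` is hypothetical.

```python
from collections import defaultdict


def match(e):
    """Return (matched, unmatched) for a sequence of actions e.

    matched maps the position of a send to the position of its reception;
    unmatched lists the positions of sends with no matching reception.
    """
    sends, recs = defaultdict(list), defaultdict(list)
    for idx, (kind, p, q, v) in enumerate(e):
        (sends if kind == "send" else recs)[(p, q, v)].append(idx)
    matched = {}
    for key, send_positions in sends.items():
        # Pair the l-th send with the l-th reception carrying the same p, q, v.
        for i, j in zip(send_positions, recs.get(key, [])):
            if i < j:
                matched[i] = j
    unmatched = [i for i, a in enumerate(e)
                 if a[0] == "send" and i not in matched]
    return matched, unmatched
```

Each entry of `matched` corresponds to a message exchange with two actions, and each entry of `unmatched` to an exchange consisting of a single send.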

An execution imposes a total order on the actions. We are interested in 
stressing the causal dependencies between messages. We thus make use of mes- 
sage sequence charts (MSCs) that only impose an order between matched pairs 
of actions and between the actions of a same process. Informally, an MSC will be 
depicted with vertical timelines (one for each process) where time goes from top 
to bottom, that carry some events (points) representing send and receive actions 
of this process (see Fig. 1). An arc is drawn between two matched events. We 
will also draw a dashed arc to depict an unmatched send event. An MSC is, thus, 
a partially ordered set of events, each corresponding to a send or receive action. 


Definition 3 (MSC). A message sequence chart is a tuple $(Ev, \lambda, \prec)$, where

- $Ev$ is a finite set of events,
- $\lambda : Ev \to S \cup R$ tags each event with an action,
- $\prec \;=\; (\prec_{po} \cup \prec_{src})^{+}$ is the transitive closure of $\prec_{po}$ and $\prec_{src}$ where:
  • $\prec_{po}$ is a partial order on $Ev$ such that, for all process $p$, $\prec_{po}$ induces a total order on the set of events of process $p$, i.e., on $\lambda^{-1}(S_p \cup R_p)$
  • $\prec_{src}$ is a binary relation that relates each receive event to its preceding send event:
    ∗ for all events $r \in \lambda^{-1}(R)$, there is exactly one event $s$ such that $s \prec_{src} r$
    ∗ for all events $s \in \lambda^{-1}(S)$, there is at most one event $r$ such that $s \prec_{src} r$
    ∗ for any two events $s, r$ such that $s \prec_{src} r$, there are $p, q, v$ such that $\lambda(s) = send(p,q,v)$ and $\lambda(r) = rec(p,q,v)$.


Fig. 1: (a) and (b): two MSCs that violate causal delivery. (c) and (d): an MSC and its conflict graph


We identify MSCs up to graph isomorphism (i.e., we view an MSC as a labeled
graph). For a given well-formed (i.e., each reception is matched) sequence of
actions $e = a_1 \cdots a_n$, we let msc(e) be the MSC where $Ev = [1..n]$, $\prec_{po}$ is the
set of pairs of indices (i, j) such that i < j and $\{a_i, a_j\} \subseteq S_p \cup R_p$ for some
$p \in \mathbb{P}$ (i.e., $a_i$ and $a_j$ are actions of the same process), and $\prec_{src}$ is the set of pairs
of indices (i, j) such that $a_i \vdash\!\dashv a_j$. We say that $e = a_1 \cdots a_n$ is a linearisation
of msc(e), and we write asTr(G) to denote $\{msc(e) \mid e \in asEx(G)\}$, the set of
MSCs of system G.
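As an illustration, the construction of msc(e) can be sketched in Python. This is only a minimal sketch under assumptions that are not part of the paper: actions are encoded as tuples `(kind, sender, receiver, payload)`, the names `acting_process` and `msc` are hypothetical, and each (sender, receiver, payload) triple is assumed to occur at most once, so that a send matches the reception carrying the same triple.

```python
from typing import List, Set, Tuple

# Hypothetical encoding: (kind, sender, receiver, payload), kind in {"send", "rec"}.
Action = Tuple[str, str, str, str]
Pairs = Set[Tuple[int, int]]


def acting_process(a: Action) -> str:
    """A send is performed by its sender, a reception by its receiver."""
    kind, sender, receiver, _ = a
    return sender if kind == "send" else receiver


def msc(e: List[Action]) -> Tuple[Pairs, Pairs]:
    """Return the relations (po, src) over the 1-based event indices of e."""
    po: Pairs = set()
    src: Pairs = set()
    for i in range(1, len(e) + 1):
        for j in range(i + 1, len(e) + 1):
            ai, aj = e[i - 1], e[j - 1]
            # po: both actions belong to the same process.
            if acting_process(ai) == acting_process(aj):
                po.add((i, j))
            # src: assumption that a triple occurs once, so equality means matching.
            if ai[0] == "send" and aj[0] == "rec" and ai[1:] == aj[1:]:
                src.add((i, j))
    return po, src
```

The order of the MSC is then the transitive closure of the union of the two returned relations.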

Mailbox communication imposes a number of constraints on which messages can be
read and when. The precise definition is given below; we first discuss some
of the possible scenarios. For instance, if two messages are sent to the same
process, they will be received in the same order as they were sent. As another
example, unmatched messages also impose some constraints: if a process p sends
an unmatched message to r, it will not be able to send matched messages to r
afterwards (Fig. 1a); similarly, if a process p sends an unmatched message to
r, any process q that receives subsequent messages from p will not be able to
send matched messages to r afterwards (Fig. 1b). When an MSC satisfies the
constraints imposed by mailbox communication, we say that it satisfies causal
delivery. Notice that, by construction, all executions satisfy causal delivery.


Definition 4 (Causal Delivery). Let $(Ev, \lambda, \prec)$ be an MSC. We say that it
satisfies causal delivery if the MSC has a linearisation $e = a_1 \cdots a_n$ such that for
any two events i < j such that $a_i = send(p, q, v)$ and $a_j = send(p', q, v')$, either
$a_j$ is unmatched, or there are i', j' such that $a_i \vdash\!\dashv a_{i'}$, $a_j \vdash\!\dashv a_{j'}$, and i' < j'.
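The condition of Definition 4 can be checked on one given linearisation as in the sketch below. It assumes a hypothetical tuple encoding `(kind, sender, receiver, payload)` and that each (sender, receiver, payload) triple occurs at most once; the function name `respects_mailbox_order` is ours. Note that an MSC satisfies causal delivery as soon as some linearisation passes this check, so a full decision procedure would have to consider all linearisations.

```python
from typing import List, Tuple

# Hypothetical encoding: (kind, sender, receiver, payload), kind in {"send", "rec"}.
Action = Tuple[str, str, str, str]


def respects_mailbox_order(e: List[Action]) -> bool:
    """Check the condition of Definition 4 on the single linearisation e."""
    # Position of each reception, keyed by (sender, receiver, payload).
    rec_pos = {a[1:]: idx for idx, a in enumerate(e) if a[0] == "rec"}
    sends = [a for a in e if a[0] == "send"]
    for n, ai in enumerate(sends):
        for aj in sends[n + 1:]:
            if ai[2] != aj[2]:
                continue  # the two sends target different mailboxes
            if aj[1:] not in rec_pos:
                continue  # a_j is unmatched: nothing to check
            # a_j is matched, so a_i must be matched and received first.
            if ai[1:] not in rec_pos or rec_pos[ai[1:]] > rec_pos[aj[1:]]:
                return False
    return True
```

For instance, a sequence in which p sends an unmatched message to r and then a matched one to r (the situation of Fig. 1a) is rejected.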


Our definition enforces the following intuitive property.

Proposition 1. An MSC msc satisfies causal delivery if and only if there is a
system G and an execution $e \in asEx(G)$ such that msc = msc(e).


We now recall from [4] the definition of conflict graph depicting the causal
dependencies between message exchanges. Intuitively, we have a dependency
whenever two messages have a process in common. For instance, an $\xrightarrow{SS}$
dependency between message exchanges v and v' expresses the fact that v' has been
sent after v, by the same process.


Definition 5 (Conflict Graph). The conflict graph CG(e) of a sequence of
actions $e = a_1 \cdots a_n$ is the labeled graph $(V, \{\xrightarrow{XY}\}_{X,Y \in \{R,S\}})$ where V is the set
of message exchanges of e, and for all $X, Y \in \{S, R\}$, for all $v, v' \in V$, there is
an XY dependency edge $v \xrightarrow{XY} v'$ between v and v' if there are i < j such that
$\{a_i\} = v \cap X$, $\{a_j\} = v' \cap Y$, and $proc_X(v) = proc_Y(v')$.
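Definition 5 can be turned into a small Python sketch. The encoding is an assumption made for illustration only: actions are tuples `(kind, sender, receiver, payload)`, a message exchange is identified by its payload (assumed unique), and an edge is a triple `(v, label, v')` with label in {"SS", "SR", "RS", "RR"}. The names `acting_process` and `conflict_graph` are hypothetical.

```python
from typing import List, Set, Tuple

# Hypothetical encoding: (kind, sender, receiver, payload), kind in {"send", "rec"}.
Action = Tuple[str, str, str, str]
Edge = Tuple[str, str, str]  # (v, "XY", v')


def acting_process(a: Action) -> str:
    """proc_S for a send action, proc_R for a receive action."""
    kind, sender, receiver, _ = a
    return sender if kind == "send" else receiver


def conflict_graph(e: List[Action]) -> Set[Edge]:
    """Edges of CG(e): an XY edge for each pair i < j acting on the same process."""
    edges: Set[Edge] = set()
    for i, ai in enumerate(e):
        for aj in e[i + 1:]:
            if acting_process(ai) == acting_process(aj):
                x = "S" if ai[0] == "send" else "R"
                y = "S" if aj[0] == "send" else "R"
                edges.add((ai[3], x + y, aj[3]))
    return edges
```

Two linearisations of the same MSC yield the same set of edges, since only the relative order of actions on each process matters.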


Notice that each linearisation e of an MSC will have the same conflict graph. 
We can thus talk about an MSC and the associated conflict graph. (As an exam- 
ple see Figs. 1c and 1d.) 

We write $v \to v'$ if $v \xrightarrow{XY} v'$ for some $X, Y \in \{R, S\}$, and $v \to^* v'$ if there is
a (possibly empty) path from v to v'.


3 k-synchronizable Systems 


In this section, we define k-synchronizable systems. The main contribution of 
this part is a new characterisation of k-synchronizable executions that corrects 
the one given in [4]. 

In the rest of the paper, k denotes a given integer $k \ge 1$. A k-exchange
denotes a sequence of actions starting with at most k sends and followed by at
most k receives matching some of the sends. An MSC is k-synchronous if there
exists a linearisation that is breakable into a sequence of k-exchanges, such that
a message sent during a k-exchange cannot be received during a subsequent one:
either it is received during the same k-exchange, or it remains orphan forever.


Definition 6 (k-synchronous). An MSC msc is k-synchronous if:


1. there exists a linearisation of msc $e = e_1 \cdot e_2 \cdots e_n$ where for all $i \in [1..n]$,
$e_i \in S^{\le k} \cdot R^{\le k}$,

2. msc satisfies causal delivery,

3. for all j, j' such that $a_j \vdash\!\dashv a_{j'}$ holds in e, $a_j \vdash\!\dashv a_{j'}$ holds in some $e_i$.


An execution e is k-synchronizable if msc(e) is k-synchronous.


We write $sTr_k(G)$ to denote the set $\{msc(e) \mid e \in asEx(G) \text{ and } msc(e) \text{ is k-synchronous}\}$.


Example 1 (k-synchronous MSCs and k-synchronizable Executions).

Fig. 2: (a) the MSC of Example 1.1. (b) the MSC of Example 1.2. (c) the MSC
of Example 2 and (d) its conflict graph.


1. There is no k such that the MSC in Fig. 2a is k-synchronous. All messages
must be grouped in the same k-exchange, but it is not possible to schedule
all the sends first, because the reception of $v_1$ happens before the sending of
$v_3$. Still, this MSC satisfies causal delivery.

2. Let $e_1 = send(r, q, v_3) \cdot send(q, p, v_2) \cdot send(p, q, v_1) \cdot rec(q, p, v_2) \cdot rec(r, q, v_3)$
be an execution. Its MSC, $msc(e_1)$, depicted in Fig. 2b, satisfies causal delivery.
Notice that $e_1$ cannot be divided into 1-exchanges. However, if we consider
the alternative linearisation of $msc(e_1)$: $e_2 = send(p, q, v_1) \cdot send(q, p, v_2) \cdot
rec(q, p, v_2) \cdot send(r, q, v_3) \cdot rec(r, q, v_3)$, we have that $e_2$ is breakable into
1-exchanges in which each matched send is in a 1-exchange with its reception.
Therefore, $msc(e_1)$ is 1-synchronous and $e_1$ is 1-synchronizable. Remark that
$e_2$ is not an execution and there exists no execution that can be divided into
1-exchanges. A k-synchronous MSC highlights dependencies between messages
but does not impose an order for the execution.
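The two linearisations of Example 1.2 can be checked mechanically. The sketch below decides whether one given sequence of actions is breakable into k-exchanges in the sense of conditions 1 and 3 of Definition 6 (it does not check causal delivery). It rests on assumptions of ours: the tuple encoding `(kind, sender, receiver, payload)`, unique (sender, receiver, payload) triples, and the hypothetical names `is_k_exchange` and `breakable`. A small dynamic program is used because a greedy split is not sufficient in general.

```python
from typing import List, Set, Tuple

# Hypothetical encoding: (kind, sender, receiver, payload), kind in {"send", "rec"}.
Action = Tuple[str, str, str, str]
Msg = Tuple[str, str, str]


def is_k_exchange(block: List[Action], k: int, received: Set[Msg]) -> bool:
    """At most k sends, then at most k receives matching sends of this block."""
    n_send = sum(1 for a in block if a[0] == "send")
    n_rec = len(block) - n_send
    if [a[0] for a in block] != ["send"] * n_send + ["rec"] * n_rec:
        return False
    if n_send > k or n_rec > k:
        return False
    sent = {a[1:] for a in block[:n_send]}
    recd = {a[1:] for a in block[n_send:]}
    # A send must be received in this block, or never received at all.
    return recd <= sent and all(m in recd or m not in received for m in sent)


def breakable(e: List[Action], k: int) -> bool:
    """True if e is a concatenation of k-exchanges (conditions 1 and 3 only)."""
    received = {a[1:] for a in e if a[0] == "rec"}
    ok = [True] + [False] * len(e)  # ok[j]: the prefix of length j is breakable
    for j in range(1, len(e) + 1):
        for i in range(max(0, j - 2 * k), j):
            if ok[i] and is_k_exchange(e[i:j], k, received):
                ok[j] = True
                break
    return ok[len(e)]
```

Under this encoding, the sequence $e_1$ of Example 1.2 is not breakable into 1-exchanges while $e_2$ is, in line with the discussion above.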


Comparison with [4]. In [4], the authors define the set $sEx_k(G)$ as the set of
k-synchronous executions of system G in the k-synchronous semantics. Nonetheless,
as remarked in Example 1.2, not all executions of a system can be divided into
k-exchanges even if they are k-synchronizable. Thus, in order not to lose any
executions, we have decided to reason only on MSCs (called traces in [4]).


Following standard terminology, we say that a set $U \subseteq V$ of vertices is a
strongly connected component (SCC) of a given graph $(V, \to)$ if between any two
vertices $v, v' \in U$, there exist two oriented paths $v \to^* v'$ and $v' \to^* v$. The
statement below fixes some issues with Theorem 1 in [4].


Theorem 1 (Graph Characterisation of k-synchronous MSCs). Let msc 
be a causal delivery MSC. msc is k-synchronous iff every SCC in its conflict 
graph is of size at most k and if no RS edge occurs on any cyclic path. 
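The graph condition of Theorem 1 is easy to check on a small conflict graph. The sketch below is illustrative only: edges are hypothetical triples `(v, label, v')`, the names `reach_sets` and `satisfies_theorem1` are ours, reachability is computed by a naive fixpoint that is adequate for small graphs, and the MSC is assumed to satisfy causal delivery as the theorem requires.

```python
from typing import Dict, Iterable, Set, Tuple

Edge = Tuple[str, str, str]  # (v, "XY", v') with XY in {"SS", "SR", "RS", "RR"}


def reach_sets(vertices: Iterable[str], edges: Set[Edge]) -> Dict[str, Set[str]]:
    """reach[v] = vertices reachable from v by a possibly empty path."""
    vs = list(vertices)
    reach = {v: {v} for v in vs}
    changed = True
    while changed:
        changed = False
        for u, _, w in edges:
            for v in vs:
                if u in reach[v] and not reach[w] <= reach[v]:
                    reach[v] |= reach[w]
                    changed = True
    return reach


def satisfies_theorem1(vertices: Iterable[str], edges: Set[Edge], k: int) -> bool:
    """Every SCC has size at most k and no RS edge lies on a cyclic path."""
    vs = list(vertices)
    reach = reach_sets(vs, edges)
    for v in vs:
        scc = {w for w in reach[v] if v in reach[w]}
        if len(scc) > k:
            return False
    # An RS edge u -> w is on a cyclic path iff u is reachable from w.
    return not any(label == "RS" and u in reach[w] for u, label, w in edges)
```

For a hypothetical graph made of a single cycle of three SS edges, the check succeeds for k = 3 and fails for k = 2.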


Example 2 (A 5-synchronous MSC). Fig. 2c depicts a 5-synchronous MSC that
is not 4-synchronous. Indeed, its conflict graph (Fig. 2d) contains an SCC of size
5 (all vertices are in the same SCC).

Comparison with [4]. Bouajjani et al. give a characterisation of k-synchronous
executions similar to ours, but they use the word cycle instead of SCC, and
the subsequent developments of the paper suggest that they intended to say
Hamiltonian cycle (i.e., a cyclic path that does not go twice through the same
vertex). It is not the case that an MSC is k-synchronous if and only if every
Hamiltonian cycle in its conflict graph is of size at most k and no RS edge
occurs on any cyclic path. Indeed, consider again Example 2. This graph is not
Hamiltonian, and the largest Hamiltonian cycle is of size 4 only. But as we
already discussed in Example 2, the corresponding MSC is not 4-synchronous.

As a consequence, the algorithm that is presented in [4] for deciding whether
a system is k-synchronizable is not correct either: the MSC of Fig. 2c would be
considered 4-synchronous according to this algorithm, but it is not.


4 Decidability of Reachability for k-synchronizable 
Systems 


We show that the reachability problem is decidable for k-synchronizable systems. 
While proving this result, we have to face several non-trivial aspects of causal 
delivery that were missed in [4] and that require a completely new approach. 


Definition 7 (k-synchronizable System). A system G is k-synchronizable
if all its executions are k-synchronizable, i.e., $sTr_k(G) = asTr(G)$.


In other words, a system G is k-synchronizable if for every execution e of G, 
msc(e) may be divided into k-exchanges. 


Remark 1. In particular, a system may be k-synchronizable even if some of its
executions fill the buffers with more than k messages. For instance, the only
linearisation of the 1-synchronous MSC of Fig. 2b that is an execution of the
system needs buffers of size 2.


For a k-synchronizable system, the reachability problem reduces to
reachability through a k-synchronizable execution. To show that k-synchronous
reachability is decidable, we establish that the set of k-synchronous MSCs is
regular. More precisely, we want to define a finite state automaton that accepts
a sequence $e_1 \cdot e_2 \cdots e_n$ of k-exchanges if and only if they satisfy causal delivery.

We start by giving a graph-theoretic characterisation of causal delivery. For
this, we define the extended edges $v \overset{XY}{\dashrightarrow} v'$ of a given conflict graph. The relation
$\overset{XY}{\dashrightarrow}$ is defined in Fig. 3 with $X, Y \in \{S, R\}$. Intuitively, $v \overset{XY}{\dashrightarrow} v'$ expresses that
event X of v must happen before event Y of v' due to either their order on
the same machine (Rule 1), or the fact that a send happens before its matching
receive (Rule 2), or due to the mailbox semantics (Rules 3 and 4), or because
of a chain of such dependencies (Rule 5). We observe that in the extended
conflict graph, obtained by applying these rules, a cyclic dependency appears
whenever causal delivery is not satisfied.

Fig. 3: Deduction rules for extended dependency edges of the conflict graph


Example 3. Fig. 5a and 5b depict an MSC and its associated conflict graph with
some extended edges. This MSC violates causal delivery and there is a cyclic
dependency $v_1 \overset{SS}{\dashrightarrow} v_1$.


Theorem 2 (Graph-theoretic Characterisation of Causal Delivery). An
MSC satisfies causal delivery iff there is no cyclic causal dependency of the form
$v \overset{SS}{\dashrightarrow} v$ for some vertex v of its extended conflict graph.


Let us now come back to our initial problem: we want to recognise with finite
memory the sequences $e_1, e_2, \ldots, e_n$ of k-exchanges that, once composed, give an MSC
that satisfies causal delivery. We proceed by reading the k-exchanges one by one
in sequence. This entails that, at each step, we have only a partial view of the
global conflict graph. Still, we want to determine whether the acyclicity condition
of Theorem 2 is satisfied in the global conflict graph. The crucial observation
is that only the edges generated by Rule 4 may “go back in time”. This means
that we have to remember enough information from the previously examined
k-exchanges to determine whether the current k-exchange contains a vertex v that
shares an edge with some unmatched vertex v' seen in a previous k-exchange
and whether this could participate in a cycle. This is achieved by computing two
sets of processes $C_{S,p}$ and $C_{R,p}$ that collect the following information: a process
q is in $C_{S,p}$ if it performs a send action causally after an unmatched send to
p, or it is the sender of the unmatched send; a process q belongs to $C_{R,p}$ if it
receives a message that was sent after some unmatched message directed to p.
More precisely, we have:


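$$C_{S,p} = \{\, proc_S(v) \mid v' \overset{SS}{\dashrightarrow} v \ \&\ v' \text{ is unmatched} \ \&\ proc_R(v') = p \,\}$$
$$C_{R,p} = \{\, proc_R(v) \mid v' \overset{SS}{\dashrightarrow} v \ \&\ v' \text{ is unmatched} \ \&\ proc_R(v') = p \ \&\ v \cap R \neq \emptyset \,\}$$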


These sets abstract and carry from one k-exchange to another the necessary
information to detect violations of causal delivery. We compute them in any local
conflict graph of a k-exchange incrementally, i.e., knowing what they were at the
end of the previous k-exchange, we compute them at the end of the current one.
More precisely, let $e = s_1 \cdots s_m \cdot r_1 \cdots r_{m'}$ be a k-exchange, CG(e) = (V, E) its
conflict graph and $B : \mathbb{P} \to (2^{\mathbb{P}} \times 2^{\mathbb{P}})$ the function that associates to each $p \in \mathbb{P}$
the two sets $B(p) = (C_{S,p}, C_{R,p})$. Then, the conflict graph CG(e, B) is the graph
(V', E') with $V' = V \cup \{\psi_p \mid p \in \mathbb{P}\}$ and $E' \supseteq E$ as defined below. For each
process $p \in \mathbb{P}$, the “summary node” $\psi_p$ shall account for all past unmatched
messages sent to p that occurred in some k-exchange before e. E' is the set E
of edges $\xrightarrow{XY}$ among message exchanges of e, as in Definition 5, augmented with
the following set of extra edges that takes into account summary nodes.

$$\{\psi_p \xrightarrow{SX} v \mid proc_X(v) \in C_{S,p} \ \&\ v \cap X \neq \emptyset \text{ for some } X \in \{S, R\}\} \qquad (1)$$
$$\cup\ \{\psi_p \xrightarrow{SX} v \mid proc_X(v) \in C_{R,p} \ \&\ v \cap R \neq \emptyset \text{ for some } X \in \{S, R\}\} \qquad (2)$$
$$\cup\ \{\psi_p \xrightarrow{SS} v \mid proc_R(v) \in C_{R,p} \ \&\ v \text{ is unmatched}\} \qquad (3)$$
$$\cup\ \{v \xrightarrow{SS} \psi_p \mid proc_R(v) = p \ \&\ v \cap R \neq \emptyset\} \cup \{\psi_q \xrightarrow{SS} \psi_p \mid p \in C_{R,q}\} \qquad (4)$$

Fig. 4: Definition of the relation $\xRightarrow[cd]{e,k}$


These extra edges summarise/abstract the connections to and from previous
k-exchanges. Equation (1) considers connections $\xrightarrow{SS}$ and $\xrightarrow{SR}$ that are due to
two sent messages or, respectively, a send and a receive on the same process.
Equations (2) and (3) consider connections $\xrightarrow{RR}$ and $\xrightarrow{RS}$ that are due to two
received messages or, respectively, a receive and a subsequent send on the same
process. Notice how the rules in Fig. 3 would then imply the existence of a
connection $\overset{SS}{\dashrightarrow}$; in particular, Equation (3) abstracts the existence of an edge
$\overset{SS}{\dashrightarrow}$ built because of Rule 4. The equations in (4) abstract edges that would
connect the current k-exchange to previous ones. As before, those edges in the
global conflict graph would correspond to extended edges added because of Rule 4
in Fig. 3. Once we have this enriched local view of the conflict graph, we take its
extended version. Let $\overset{XY}{\dashrightarrow}$ denote the edges of the extended conflict graph as
defined from the rules in Fig. 3, taking into account the new vertices $\psi_p$ and
their edges.


Finally, let G be a system and $\xRightarrow[cd]{e,k}$ be the transition relation given in Fig. 4
among abstract configurations of the form $(\vec{l}, B)$, where $\vec{l}$ is a global control
state of G and $B : \mathbb{P} \to (2^{\mathbb{P}} \times 2^{\mathbb{P}})$ is the function defined above that associates
to each process p a pair of sets of processes $B(p) = (C_{S,p}, C_{R,p})$. Transition
$\xRightarrow[cd]{e,k}$ updates these sets with respect to the current k-exchange e. Causal
delivery is verified by checking that for all $p \in \mathbb{P}$, $p \notin C'_{R,p}$, meaning that there
is no cyclic dependency as stated in Theorem 2. The initial state is $(\vec{l_0}, B_0)$,
where $B_0 : \mathbb{P} \to (2^{\mathbb{P}} \times 2^{\mathbb{P}})$ denotes the function such that $B_0(p) = (\emptyset, \emptyset)$ for all
$p \in \mathbb{P}$.

Fig. 5: (a) an MSC, (b) its associated global conflict graph, (c) the conflict graphs
of its k-exchanges


Example 4 (An Invalid Execution). Let $e = e_1 \cdot e_2$ with $e_1$ and $e_2$ the two
2-exchanges of this execution, such that $e_1 = send(q, r, v_1) \cdot send(q, s, v_2) \cdot
rec(q, s, v_2)$ and $e_2 = send(p, s, v_3) \cdot rec(p, s, v_3) \cdot send(p, r, v_4) \cdot rec(p, r, v_4)$.
Fig. 5a and 5c show the MSC and the corresponding conflict graph of each of the
2-exchanges. Note that two edges of the global graph (in blue) “go across”
k-exchanges. These edges do not belong to the local conflict graphs and are
mimicked by the incoming and outgoing edges of summary nodes. The values of
the sets $C_{S,r}$ and $C_{R,r}$ at the beginning and at the end of the k-exchange are given
on the right. All other sets $C_{S,p}$ and $C_{R,p}$ for $p \neq r$ are empty, since there is
only one unmatched message to process r. Notice how, at the end of the second
k-exchange, $r \in C'_{R,r}$, signalling that message $v_4$ violates causal delivery.


Comparison with [4]. In [4] the authors define $\xRightarrow[cd]{e,k}$ in a rather different way:
they do not explicitly give a graph-theoretic characterisation of causal delivery;
instead they compute, for every process p, the set B(p) of processes that either
sent an unmatched message to p or received a message from a process in B(p).
They then make sure that any message sent to p by a process $q \in B(p)$ is
unmatched. According to that definition, the MSC of Fig. 5b would satisfy causal
delivery and would be 1-synchronous. However, this is not the case (this MSC
does not satisfy causal delivery) as we have shown in Example 3. Due to the
above errors, we had to propose a considerably different approach. The extended
edges of the conflict graph, and the graph-theoretic characterisation of causal
delivery as well as summary nodes, have no equivalent in [4].


The next lemma proves that Fig. 4 properly characterises causal delivery.

Lemma 1. An MSC msc is k-synchronous iff there is a linearisation
$e = e_1 \cdots e_n$ such that $(\vec{l_0}, B_0) \xRightarrow[cd]{e_1,k} \cdots \xRightarrow[cd]{e_n,k} (\vec{l'}, B')$ for some global state $\vec{l'}$ and some
$B' : \mathbb{P} \to (2^{\mathbb{P}} \times 2^{\mathbb{P}})$.


Note that there are only finitely many abstract configurations of the form
$(\vec{l}, B)$ with $\vec{l}$ a tuple of control states and $B : \mathbb{P} \to (2^{\mathbb{P}} \times 2^{\mathbb{P}})$. Moreover, since V
is finite, the alphabet over the possible k-exchanges for a given k is also finite.
Therefore $\xRightarrow[cd]{e,k}$ is a relation on a finite set, and the set $sTr_k(G)$ of k-synchronous
MSCs of a system G forms a regular language. It follows that it is decidable
whether a given abstract configuration of the form $(\vec{l}, B)$ is reachable from the
initial configuration following a k-synchronizable execution.


Theorem 3. Let G be a k-synchronizable system and $\vec{l}$ a global control state of
G. The problem whether there exists $e \in asEx(G)$ and Buf such that
$(\vec{l_0}, Buf_0) \xRightarrow{e} (\vec{l}, Buf)$ is decidable.


Remark 2. Deadlock-freedom, unspecified receptions, and absence of orphan mes- 
sages are other properties that become decidable for a k-synchronizable system 
because of the regularity of the set of k-synchronous MSCs. 


5 Decidability of k-synchronizability for Mailbox Systems 


We establish the decidability of k-synchronizability; our approach is similar to 
the one of [4] based on the notion of borderline violation, but we adjust it to 
adapt to the new characterisation of k-synchronizable executions (Theorem 1). 


Definition 8 (Borderline Violation). A non k-synchronizable execution e is
a borderline violation if $e = e' \cdot r$, r is a reception and e' is k-synchronizable.


Note that a system G that is not k-synchronizable always admits at least one
borderline violation $e' \cdot r \in asEx(G)$ with $r \in R$: indeed, there is at least one
execution $e \in asEx(G)$ which contains a unique minimal prefix of the form $e' \cdot r$
that is not k-synchronizable; moreover, since e' is k-synchronizable, r cannot be a
k-exchange of just one send action, therefore it must be a receive action. In order
to find such a borderline violation, Bouajjani et al. introduced an instrumented
system G' that behaves like G, except that it contains an extra process $\pi$, and
such that a non-deterministically chosen message that should have been sent
from a process p to a process q may now be sent from p to $\pi$, and later forwarded
by $\pi$ to q. In G', each process p has the possibility, instead of sending a message
v to q, to deviate this message to $\pi$; if it does so, p continues its execution as if it
really had sent it to q. Note also that the message sent to $\pi$ gets tagged with the
original destination process q. Similarly, for each possible reception, a process
has the possibility to receive a given message not from the initial sender but from
$\pi$. The process $\pi$ has an initial state from which it can receive any message from
the system. Each reception makes it go into a different state. From this state,
it is able to send the message back to the original recipient. Once a message
is forwarded, $\pi$ reaches its final state and remains idle. The following example
illustrates how the instrumented system works.


Example 5 (A Deviated Message).

Let $e_1$, $e_2$ be two executions of a system G with MSCs respectively $msc(e_1)$ and
$msc(e_2)$. $e_1$ is not 1-synchronizable. It is borderline in G. If we delete the last
reception, it becomes indeed 1-synchronizable. $msc(e_2)$ is the MSC obtained from
the instrumented system G' where the message $v_1$ is first deviated to $\pi$ and then
sent back to q from $\pi$.

Note that $msc(e_2)$ is 1-synchronous. In this case, the instrumented system G' in
the 1-synchronous semantics “reveals” the existence of a borderline violation of G.

[Figure: $msc(e_1)$ (left) and $msc(e_2)$ (right)]


For each execution $e \cdot r \in asEx(G)$ that ends with a reception, there exists
an execution $deviate(e \cdot r) \in asEx(G')$ where the message exchange associated
with the reception r has been deviated to $\pi$; formally, if $e \cdot r = e_1 \cdot s \cdot e_2 \cdot r$ with
r = rec(p, q, v) and $s \vdash\!\dashv r$, then


$deviate(e \cdot r) = e_1 \cdot send(p, \pi, (q, v)) \cdot rec(p, \pi, (q, v)) \cdot e_2 \cdot send(\pi, q, v) \cdot rec(\pi, q, v).$
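The deviation is a purely syntactic rewriting of the sequence, which the following sketch makes explicit. The encoding is an assumption of ours: actions are tuples `(kind, sender, receiver, payload)`, the extra process is named by the hypothetical constant `PI`, and the send matching the last reception is located by equality of its (sender, receiver, payload) triple, assumed unique.

```python
from typing import Any, List, Tuple

# Hypothetical encoding: (kind, sender, receiver, payload), kind in {"send", "rec"}.
Action = Tuple[str, str, str, Any]

PI = "pi"  # assumed name for the extra process of the instrumented system


def deviate(er: List[Action]) -> List[Action]:
    """Rewrite e.r = e1.s.e2.r into the deviated sequence of the text."""
    *e, r = er
    kind, p, q, v = r
    if kind != "rec":
        raise ValueError("the last action must be a reception")
    i = e.index(("send", p, q, v))  # position of the send s matching r
    e1, e2 = e[:i], e[i + 1:]
    return (
        e1
        + [("send", p, PI, (q, v)), ("rec", p, PI, (q, v))]  # p -> pi, tagged with q
        + e2
        + [("send", PI, q, v), ("rec", PI, q, v)]  # pi forwards v to q
    )
```

The deviated message carries the pair (q, v), which mirrors the tagging with the original destination described above.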


Definition 9 (Feasible Execution, Bad Execution). A k-synchronizable
execution e' of G' is feasible if there is an execution $e \cdot r \in asEx(G)$ such that
$deviate(e \cdot r) = e'$. A feasible execution $e' = deviate(e \cdot r)$ of G' is bad if execution
$e \cdot r$ is not k-synchronizable in G.


Example 6 (A Non-feasible Execution).
Let e' be an execution such that msc(e') is as depicted on the right. Clearly,
this MSC satisfies causal delivery and could be the execution of some
instrumented system G'. However, the sequence $e \cdot r$ such that $deviate(e \cdot r) = e'$
does not satisfy causal delivery, therefore it cannot be an execution of the
original system G. In other words, the execution e' is not feasible.

[Figure: msc(e') (left) and $msc(e \cdot r)$ (right)]


Lemma 2. A system G is not k-synchronizable iff there is a k-synchronizable
execution e' of G' that is feasible and bad.


As we have already noted, the set of k-synchronous MSCs of G' is regular.
The decision procedure for k-synchronizability follows from the fact that the
set of MSCs that have as linearisation a feasible bad execution is, as we will see,
regular as well, and that it can be recognised by an (effectively computable)
non-deterministic finite state automaton. The decidability of k-synchronizability
then follows from Lemma 2 and the decidability of the emptiness problem for
non-deterministic finite state automata.

Recognition of Feasible Executions. We start with the automaton that
recognises feasible executions; for this, we revisit the construction we just used
for recognising sequences of k-exchanges that satisfy causal delivery.

In the remainder, we assume an execution $e' \in asEx(G')$ that contains
exactly one send of the form $send(p, \pi, (q, v))$ and one reception of the form
$rec(\pi, q, v)$, this reception being the last action of e'. Let $(V, \{\xrightarrow{XY}\}_{X,Y \in \{R,S\}})$ be
the conflict graph of e'. There are two uniquely determined vertices $v_{start}, v_{stop} \in V$
such that $proc_R(v_{start}) = \pi$ and $proc_S(v_{stop}) = \pi$ that correspond, respectively,
to the first and last message exchanges of the deviation. The conflict graph of
$e \cdot r$ is then obtained by merging these two nodes.


Lemma 3. The execution e' is not feasible iff there is a vertex v in the conflict
graph of e' such that $v_{start} \overset{SS}{\dashrightarrow} v \xrightarrow{RR} v_{stop}$.


In order to decide whether an execution e' is feasible, we want to forbid that
a send action send(p', q, v') that happens causally after $v_{start}$ is matched by a
receive rec(p', q, v') that happens causally before the reception $v_{stop}$. As a matter
of fact, this boils down to dealing with the deviated send action as an unmatched
send. We will consider sets of processes $C_S^\pi$ and $C_R^\pi$ similar to the ones used
for $\xRightarrow[cd]{e,k}$, but with the goal of computing which actions happen causally after the
send to $\pi$. We also introduce a summary node $\psi_{start}$ and the extra edges following
the same principles as in the previous section. Formally, let $B : \mathbb{P} \to (2^{\mathbb{P}} \times 2^{\mathbb{P}})$,
$C_S^\pi, C_R^\pi \subseteq \mathbb{P}$ and $e \in S^{\le k} R^{\le k}$ be fixed, and let CG(e, B) = (V', E') be the
constraint graph with summary nodes for unmatched sent messages as defined
in the previous section. The local constraint graph $CG(e, B, C_S^\pi, C_R^\pi)$ is defined
as the graph (V'', E'') where $V'' = V' \cup \{\psi_{start}\}$ and E'' is E' augmented with

$$\{\psi_{start} \xrightarrow{SX} v \mid proc_X(v) \in C_S^\pi \ \&\ v \cap X \neq \emptyset \text{ for some } X \in \{S, R\}\}$$
$$\cup\ \{\psi_{start} \xrightarrow{SX} v \mid proc_X(v) \in C_R^\pi \ \&\ v \cap R \neq \emptyset \text{ for some } X \in \{S, R\}\}$$
$$\cup\ \{\psi_{start} \xrightarrow{SS} v \mid proc_R(v) \in C_R^\pi \ \&\ v \text{ is unmatched}\} \cup \{\psi_{start} \xrightarrow{SS} \psi_p \mid p \in C_R^\pi\}$$


As before, we consider the “closure” $\overset{XY}{\dashrightarrow}$ of these edges by the rules of Fig. 3.
The transition relation $\xRightarrow[feas]{e,k}$ is defined in Fig. 6. It relates abstract configurations
of the form $(\vec{l}, B, \vec{C}, dest_\pi)$ with $\vec{C} = (C_S^\pi, C_R^\pi)$ and $dest_\pi \in \mathbb{P} \cup \{\bot\}$ storing to
whom the message deviated to $\pi$ was supposed to be delivered. Thus, the initial
abstract configuration is $(\vec{l_0}, B_0, (\emptyset, \emptyset), \bot)$, where $\bot$ means that the process
$dest_\pi$ has not been determined yet. It will be set as soon as the send to process
$\pi$ is encountered.


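Lemma 4. Let e' be an execution of G'. Then e' is a k-synchronizable feasible
execution iff there are $e'' = e_1 \cdots e_n \cdot send(\pi, q, v) \cdot rec(\pi, q, v)$ with
$e_1, \ldots, e_n \in S^{\le k} R^{\le k}$, $B' : \mathbb{P} \to (2^{\mathbb{P}})^2$, $\vec{C'} \in (2^{\mathbb{P}})^2$, and a tuple of control states $\vec{l'}$ such that
msc(e') = msc(e''), $\pi \notin C_{R,q}$ (with $B'(q) = (C_{S,q}, C_{R,q})$), and

$$(\vec{l_0}, B_0, (\emptyset, \emptyset), \bot) \xRightarrow[feas]{e_1,k} \cdots \xRightarrow[feas]{e_n,k} (\vec{l'}, B', \vec{C'}, q).$$

$$
\frac{
\begin{array}{c}
(\vec{l}, B) \xRightarrow[cd]{e,k} (\vec{l'}, B') \qquad e = a_1 \cdots a_n \qquad (\forall v)\ proc_S(v) \neq \pi \\
(\forall v, v')\ proc_R(v) = proc_R(v') = \pi \implies v = v' \wedge dest_\pi = \bot \\
(\forall v)\ v \ni send(p, \pi, (q, v)) \implies dest'_\pi = q \qquad dest_\pi \neq \bot \implies dest'_\pi = dest_\pi \\
C_X^{\pi\prime} = C_X^\pi \cup \{proc_X(v') \mid v \overset{SS}{\dashrightarrow} v' \ \&\ v' \cap X \neq \emptyset \ \&\ (proc_R(v) = \pi \text{ or } v = \psi_{start})\} \\
\cup\ \{proc_S(v) \mid proc_R(v) = \pi \ \&\ X = S\} \\
\cup\ \{p \mid p \in C_{X,q} \ \&\ v \overset{SS}{\dashrightarrow} \psi_q \ \&\ (proc_R(v) = \pi \text{ or } v = \psi_{start})\} \\
dest'_\pi \notin C_R^{\pi\prime}
\end{array}
}{
(\vec{l}, B, C_S^\pi, C_R^\pi, dest_\pi) \xRightarrow[feas]{e,k} (\vec{l'}, B', C_S^{\pi\prime}, C_R^{\pi\prime}, dest'_\pi)
}
$$

Fig. 6: Definition of the relation $\xRightarrow[feas]{e,k}$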


Comparison with [4]. In [4] the authors verify that an execution is feasible with
a monitor which reviews the actions of the execution and adds processes that
are no longer allowed to send a message to the receiver of $\pi$. Unfortunately, we
have here a problem similar to the one mentioned in the previous comparison
paragraph. According to their monitor, the following execution $e' = deviate(e \cdot r)$
is feasible, i.e., e' is runnable in G' and $e \cdot r$ is runnable in G.


$e' = send(q, \pi, (r, v_1)) \cdot rec(q, \pi, (r, v_1)) \cdot send(q, s, v_2) \cdot rec(q, s, v_2) \cdot$
$\quad send(p, s, v_3) \cdot rec(p, s, v_3) \cdot send(p, r, v_4) \cdot rec(p, r, v_4) \cdot$
$\quad send(\pi, r, v_1) \cdot rec(\pi, r, v_1)$


However, this execution is not feasible because there is a causal dependency
between $v_1$ and $v_3$. In [4] this execution would then be considered as feasible
and therefore would belong to the set $sTr_k(G')$. Yet there is no corresponding
execution in asTr(G), so the comparison, and therefore the k-synchronizability
check, could be distorted and yield a false negative.


Recognition of Bad Executions. Finally, we define a non-deterministic finite
state automaton that recognizes MSCs of bad executions, i.e., feasible executions
$e' = deviate(e \cdot r)$ such that $e \cdot r$ is not k-synchronizable. We come back to the
“non-extended” conflict graph, without edges of the form $\overset{XY}{\dashrightarrow}$. Let $Post^*(v) =
\{v' \in V \mid v \to^* v'\}$ be the set of vertices reachable from v, and let $Pre^*(v) =
\{v' \in V \mid v' \to^* v\}$ be the set of vertices co-reachable from v. For a set of vertices
$U \subseteq V$, let $Post^*(U) = \bigcup\{Post^*(v) \mid v \in U\}$, and $Pre^*(U) = \bigcup\{Pre^*(v) \mid v \in U\}$.


Lemma 5. The feasible execution e' is bad iff one of the two holds


1. $v_{start} \to^* \xrightarrow{RS} \to^* v_{stop}$, or


2. the size of the set $Post^*(v_{start}) \cap Pre^*(v_{stop})$ is greater than or equal to k + 2.
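On an explicitly given conflict graph, both conditions of Lemma 5 reduce to plain reachability, as the following sketch illustrates. It is not the automaton construction developed in the text, only a direct check under assumptions of ours: edges are triples `(v, label, v')`, and the names `closure` and `is_bad` are hypothetical.

```python
from collections import deque
from typing import Dict, Set, Tuple

Edge = Tuple[str, str, str]  # (v, "XY", v') with XY in {"SS", "SR", "RS", "RR"}


def closure(start: str, succ: Dict[str, Set[str]]) -> Set[str]:
    """Vertices reachable from start (possibly empty path), by BFS."""
    seen, todo = {start}, deque([start])
    while todo:
        u = todo.popleft()
        for w in succ.get(u, ()):
            if w not in seen:
                seen.add(w)
                todo.append(w)
    return seen


def is_bad(edges: Set[Edge], v_start: str, v_stop: str, k: int) -> bool:
    """Direct check of the two conditions of Lemma 5."""
    succ: Dict[str, Set[str]] = {}
    pred: Dict[str, Set[str]] = {}
    for u, _, w in edges:
        succ.setdefault(u, set()).add(w)
        pred.setdefault(w, set()).add(u)
    post = closure(v_start, succ)  # Post*(v_start)
    pre = closure(v_stop, pred)    # Pre*(v_stop)
    # Condition 1: some RS edge lies on a path from v_start to v_stop.
    rs_on_path = any(l == "RS" and u in post and w in pre for u, l, w in edges)
    # Condition 2: at least k + 2 vertices lie between v_start and v_stop.
    return rs_on_path or len(post & pre) >= k + 2
```

The automaton of the text computes the same information incrementally, one k-exchange at a time, without storing the whole graph.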


In order to determine whether a given message exchange v of CG(e') should
be counted as reachable (resp. co-reachable), we will compute at the entry and
exit of every k-exchange of e' which processes are “reachable” or “co-reachable”.

Example 7 (Reachable and Co-reachable Processes).
Consider the MSC on the right made of five 1-exchanges. While sending message
$(s, v_0)$ that corresponds to $v_{start}$, process r becomes “reachable”: any subsequent
message exchange that involves r corresponds to a vertex of the conflict graph
that is reachable from $v_{start}$. While sending $v_2$, process s becomes “reachable”,
because process r will be reachable when it will receive message $v_2$. Similarly,
q becomes reachable after receiving $v_3$ because r was reachable when it sent
$v_3$, and p becomes reachable after receiving $v_4$ because q was reachable when it
sent $v_4$. Co-reachability works similarly, but reasoning backwards on the timelines.
For instance, process s stops being “co-reachable” while it receives $v_0$, process
r stops being co-reachable after it receives $v_2$, and process p stops being
co-reachable by sending $v_1$. The only message that is sent by a process being both
reachable and co-reachable at the instant of the sending is $v_2$, therefore it is the
only message that will be counted as contributing to the SCC.

[Figure: the MSC of Example 7, made of five 1-exchanges]


More formally, let e be a sequence of actions, CG(e) its conflict graph and
P, Q two sets of processes. $Post_e(P) = Post^*(\{v \mid procs(v) \cap P \neq \emptyset\})$ and
$Pre_e(Q) = Pre^*(\{v \mid procs(v) \cap Q \neq \emptyset\})$ are introduced to represent the local
view through k-exchanges of $Post^*(v_{start})$ and $Pre^*(v_{stop})$. For instance, for e
as in Example 7, we get $Post_e(\{\pi\}) = \{(s, v_0), v_2, v_3, v_4, v_0\}$ and $Pre_e(\{\pi\}) =
\{v_0, v_2, v_1, (s, v_0)\}$. In each k-exchange $e_i$ the size of the intersection between
$Post_{e_i}(P)$ and $Pre_{e_i}(Q)$ will give the local contribution of the current k-exchange
to the calculation of the size of the global SCC. In the transition relation $\xRightarrow[bad]{e,k}$
this value is stored in variable cnt. The last ingredient to consider is to recognise
if an edge RS belongs to the SCC. To this aim, we use a function $lastisRec :
\mathbb{P} \to \{True, False\}$ that for each process stores the information whether the last
action in the previous k-exchange was a reception or not. Then, depending on
the value of this variable and on whether a node is in the current SCC or not,
the value of sawRS is set accordingly.


The transition relation $\xRightarrow[bad]{e,k}$ defined in Fig. 7 deals with abstract
configurations of the form (P, Q, cnt, sawRS, lastisRec) where $P, Q \subseteq \mathbb{P}$, sawRS is a
boolean value, and cnt is a counter bounded by k + 2. We denote by $lastisRec_0$
the function where $lastisRec_0(p) = False$ for all $p \in \mathbb{P}$.


Lemma 6. Let e' be a feasible k-synchronizable execution of G'. Then e' is a bad
execution iff there are $e'' = e_1 \cdots e_n \cdot send(\pi, q, v) \cdot rec(\pi, q, v)$ with
$e_1, \ldots, e_n \in S^{\le k} R^{\le k}$ and msc(e') = msc(e''), $P', Q \subseteq \mathbb{P}$, $sawRS \in \{True, False\}$,
$cnt \in \{0, \ldots, k + 2\}$, such that

$$(\{\pi\}, Q, 0, False, lastisRec_0) \xRightarrow[bad]{e_1,k} \cdots \xRightarrow[bad]{e_n,k} (P', \{\pi\}, cnt, sawRS, lastisRec)$$

and at least one of the two holds: either sawRS = True, or cnt = k + 2.

$$
\frac{
\begin{array}{c}
P' = procs(Post_e(P)) \qquad Q = procs(Pre_e(Q')) \\
SCC_e = Post_e(P) \cap Pre_e(Q') \\
cnt' = \min(k + 2, cnt + n) \text{ where } n = |SCC_e| \\
lastisRec'(q) \Leftrightarrow (\exists v \in SCC_e.\ proc_R(v) = q \wedge v \cap R \neq \emptyset) \vee (lastisRec(q) \wedge \nexists v \in V.\ proc_S(v) = q) \\
sawRS' = sawRS \vee (\exists v \in SCC_e)(\exists p \in \mathbb{P} \setminus \{\pi\})\ proc_S(v) = p \wedge lastisRec(p) \wedge p \in P \cap Q
\end{array}
}{
(P, Q, cnt, sawRS, lastisRec) \xRightarrow[bad]{e,k} (P', Q', cnt', sawRS', lastisRec')
}
$$

Fig. 7: Definition of the relation $\xRightarrow[bad]{e,k}$


Comparison with [4]. As for the notion of feasibility, to determine if an execution
is bad, in [4] the authors use a monitor that builds a path between the send to
process $\pi$ and the send from $\pi$. In addition to the problems related to the wrong
characterisation of k-synchronizability, this monitor not only can detect an RS
edge when there should be none, but can also miss them when they exist. In
general, the problem arises because the path is constructed by considering only
one endpoint at a time.


We can finally conclude that: 


Theorem 4. The k-synchronizability of a system G is decidable for $k \ge 1$.


6 k-synchronizability for Peer-to-Peer Systems 


In this section, we will apply k-synchronizability to peer-to-peer systems. A peer- 
to-peer system is a composition of communicating automata where each pair of 
machines exchange messages via two private FIFO buffers, one per direction of 
communication. Here we only give an insight on what changes with respect to 
the mailbox setting. 

Causal delivery reveals the order imposed by FIFO buffers. Definition 4 must 
then be adapted to account for peer-to-peer communication. For instance, two 
messages that are sent to a same process p by two different processes can be 
received by p in any order, regardless of any causal dependency between the two 
sends. Thus, checking causal delivery in peer-to-peer systems is easier than in the 
mailbox setting, as we do not have to carry information on causal dependencies. 
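The difference between the two buffer disciplines can be illustrated with a small simulation. This is only a sketch under simplifying assumptions: the function name `receivable`, the process names and the message names are hypothetical, and the sketch models nothing but the buffer layout (one FIFO per receiver for mailbox, one FIFO per ordered pair of processes for peer-to-peer).

```python
from collections import defaultdict, deque


def receivable(sends, mode):
    """Return the set of messages that may be received next.

    sends: list of (sender, receiver, message), in the order they were sent.
    mode:  "mailbox" -> one FIFO buffer per receiver;
           "p2p"     -> one FIFO buffer per (sender, receiver) pair.
    """
    buffers = defaultdict(deque)
    for sender, receiver, msg in sends:
        key = receiver if mode == "mailbox" else (sender, receiver)
        buffers[key].append(msg)
    # Only the head of each FIFO buffer can be received.
    return {buf[0] for buf in buffers.values() if buf}


# q and r both send a message to p (hypothetical example).
sends = [("q", "p", "m1"), ("r", "p", "m2")]
print(sorted(receivable(sends, "mailbox")))  # ['m1']
print(sorted(receivable(sends, "p2p")))      # ['m1', 'm2']
```

In the mailbox reading, p must take m1 first; in the peer-to-peer reading, either message can be taken first, which is why no information on causal dependencies between the two sends has to be carried around.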

Within a peer-to-peer architecture, MSCs and conflict graphs are defined
as within a mailbox communication. Indeed, they represent dependencies over
machines, i.e., the order in which the actions can be done on a given machine, and
over the send and the reception of the same message, and they do not depend on
the type of communication. The notion of k-exchange also remains unchanged.




Decidability of Reachability for k-synchronizable Peer-to-Peer Systems.
To establish the decidability of reachability for k-synchronizable peer-to-peer
systems, we define a transition relation $\xRightarrow[\mathrm{cd}]{e,k}{}^{\mathrm{P2P}}$ for a sequence of actions e
describing a k-exchange. As for mailbox systems, if a send action is unmatched
in the current k-exchange, it will stay orphan forever. Moreover, after a process
p has sent an orphan message to a process q, p is forbidden to send any matched
message to q. Nonetheless, as a consequence of the simpler definition of causal
delivery, we no longer need to work on the conflict graph. Summary nodes and
extended edges are not needed and all the necessary information is in function
B that solely contains all the forbidden senders for process p.

The characterisation of a k-synchronizable execution is the same as for mail- 
box systems as the type of communication is not relevant. We can thus conclude, 
as within mailbox communication, that reachability is decidable. 


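Theorem 5. Let $\mathfrak{S}$ be a k-synchronizable system and $\vec{l}$ a global control state of
$\mathfrak{S}$. The problem whether there exists $e \in \mathit{asEx}(\mathfrak{S})$ and $\mathit{Buf}$ such that
$(\vec{l_0}, \mathit{Buf}_0) \xRightarrow{e} (\vec{l}, \mathit{Buf})$ is decidable.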


Decidability of k-synchronizability for Peer-to-Peer Systems. As in
mailbox systems, the detection of a borderline execution determines whether a
system is k-synchronizable.

The transition relation $\xRightarrow[\mathrm{feas}]{e,k}{}^{\mathrm{P2P}}$ allows us to obtain feasible executions.
Differently from the mailbox setting, we need to save not only the recipient $\mathit{dest}_\pi$, but
also the sender of the delayed message (information stored in variable $\mathit{exp}_\pi$).
The transition rule then checks that there is no message that is violating causal
delivery, i.e., there is no message sent by $\mathit{exp}_\pi$ to $\mathit{dest}_\pi$ after the deviation.


Finally, the recognition of bad executions works in the same way as for mailbox
systems. The characterisation of a bad execution and the definition of
$\xRightarrow[\mathrm{bad}]{e,k}{}^{\mathrm{P2P}}$ are, therefore, the same.
As for mailbox systems, we can thus conclude that, for a given k,
k-synchronizability is decidable.


Theorem 6. The k-synchronizability of a system $\mathfrak{S}$ is decidable for $k \geq 1$.


7 Concluding Remarks and Related Works


In this paper we have studied k-synchronizability for mailbox and peer-to-peer 
systems. We have corrected the reachability and decidability proofs given in [4]. 
The flaws in [4] concern fundamental points and we had to propose a considerably
different approach. The extended edges of the conflict graph, and the
graph-theoretic characterisation of causal delivery as well as summary nodes,
have no equivalent in [4]. Transition relations $\xRightarrow[\mathrm{feas}]{e,k}$ and $\xRightarrow[\mathrm{bad}]{e,k}$, building on the
graph-theoretic characterisations of causal delivery and k-synchronizability,
depart considerably from the proposal in [4].

We conclude by commenting on some other related works. The idea of “com- 
munication layers” is present in the early works of Elrad and Francez [8] or Chou 
and Gafni [7]. More recently, Chaouch-Saad et al. [6] verified some consensus al- 
gorithms using the Heard-Of Model that proceeds by “communication-closed 
rounds”. The concept that an asynchronous system may have an “equivalent” 
synchronous counterpart has also been widely studied. Lipton’s reduction [14]
reschedules an execution so as to move the receive actions as close as possible
to their corresponding sends. Reduction has recently received increasing interest
for verification purposes, e.g., by Kragl et al. [12], or Gleissenthal et al. [11].

Existentially bounded communication systems have been studied by Ge- 
nest et al. [10,15]: a system is existentially k-bounded if any execution can be 
rescheduled in order to become k-bounded. This approach targets a broader class
of systems than k-synchronizability, because it does not require that the
execution can be chopped into communication-closed rounds. In the perspective of the
current work, an interesting result is the decidability of existential k-boundedness
for deadlock-free systems of communicating machines with peer-to-peer channels.
Despite the more general definition, these older results are incomparable with
the present ones, which deal with systems communicating with mailboxes, and
not peer-to-peer channels.

Basu and Bultan studied a notion they also called synchronizability, but it 
differs from the notion studied in the present work; synchronizability and k- 
synchronizability define incomparable classes of communicating systems. The 
proofs of the decidability of synchronizability [3,2] were shown to have flaws by 
Finkel and Lozes [9]. A question left open in their paper is whether synchroni- 
zability is decidable for mailbox communications, as originally claimed by Basu 
and Bultan. Akroun and Salaün also defined a property they called stability [1]
that shares many similarities with the synchronizability notion in [2].

Context-bounded model-checking is yet another approach for the automatic 
verification of concurrent systems. La Torre et al. studied systems of
communicating machines extended with a calling stack, and showed that under some
conditions on the interplay between stack actions and communications,
context-bounded reachability is decidable [13]. A context-switch is found in an
execution each time two consecutive actions are performed by different
participants. Thus, while k-synchronizability limits the number of consecutive sends,
bounded context-switch analysis limits the number of times two consecutive
actions are performed by two different processes.
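The notion of context-switch described above can be sketched in a few lines. The encoding of an execution as a list of (process, action) pairs and the name `context_switches` are assumptions made for illustration only; they are not taken from [13].

```python
def context_switches(execution):
    """Count positions where two consecutive actions are performed by
    different processes.

    execution: list of (process, action) pairs, in execution order.
    """
    return sum(
        1
        for (p, _), (q, _) in zip(execution, execution[1:])
        if p != q
    )


# Hypothetical execution: p sends twice, q receives twice, p sends again.
execution = [
    ("p", "send a"),
    ("p", "send b"),
    ("q", "recv a"),
    ("q", "recv b"),
    ("p", "send c"),
]
print(context_switches(execution))  # 2
```

A context-bounded analysis would only explore executions for which this count stays below a fixed bound, independently of how many sends occur in a row.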

As for future work, it would be interesting to explore how both
context-boundedness and communication-closed rounds could be composed. Moreover,
refinements of the definition of k-synchronizability can also be considered. For
instance, we conjecture that the current development can be greatly simplified
if we forbid linearisations that do not correspond to actual executions.
Abstract. Delta lenses are an established mathematical framework for
modelling and designing bidirectional model transformations (Bx).
Following the recent observations by Fong et al., the paper extends the delta
lens framework with a new ingredient: learning over a parameterized
space of model transformations seen as functors. We will define a notion
of an asymmetric learning delta lens with amendment (ala-lens), and
show how ala-lenses can be organized into a symmetric monoidal (sm)
category. We also show that sequential and parallel compositions of
well-behaved (wb) ala-lenses are wb as well, so that wb ala-lenses constitute a
full sm-subcategory of ala-lenses.


1 Introduction 


The goal of the paper is to develop a formal model of supervised learning in a
very general context of bidirectional model transformation or Bx, i.e.,
synchronization of two arbitrary complex structures (called models) related by a
transformation.¹ Rather than learning parameterized functions between Euclidean
spaces as is typical for machine learning (ML), we will consider learning
mappings between model spaces and formalize them as parameterized functors
between categories, $f\colon \mathbf{P} \times \mathbf{A} \to \mathbf{B}$, with $\mathbf{P}$ being a parameter space. The basic
ML-notion of a training pair $(A, B') \in \mathbf{A}_0 \times \mathbf{B}_0$ will be considered as an
inconsistency between models caused by a change (delta) $v\colon B \to B'$ of the target
model $B = f(p, A)$, $p \in \mathbf{P}$, that was first consistent with $A$ w.r.t. the
transformation (functor) $f(p, \_)$. An inconsistency is repaired by an appropriate change
of the source structure, $u\colon A \to A'$, changing the parameter $p$ to $p'$, and an
amendment of the target structure $v^@\colon B' \to B^@$ so that $f(p', A') = B^@$ is a
consistent state of the parameterized two-model system.

The setting above without parameterization and learning (i.e., $p' = p$ always
holds), and without amendment ($v^@ = \mathsf{id}_{B'}$ always holds), is well known in
the Bx literature under the name of delta lenses: mathematical structures, in


¹ The term Bx refers to a wide area including file synchronization, data exchange in
databases, and model synchronization in Model-Driven software Engineering (MDE),
see [7] for a survey. In the present paper, Bx will mainly refer to Bx in the MDE
context.
which consistency restoration via change propagation is modelled by
functorial-like algebraic operations over categories [12,6]. There are several types of delta
lenses tailored for modelling different synchronization tasks and scenarios,
particularly, symmetric and asymmetric. In the paper, we only consider asymmetric
delta lenses and will often omit explicitly mentioning these attributes. Despite
their extra-generality, (delta) lenses have proved useful in the design and
implementation of practical model synchronization systems with triple graph
grammars (TGG) [BB]; enriching lenses with amendment is a recent extension
of the framework motivated and formalized in [IJ]. A major advantage of the
lens framework for synchronization is its compositionality: a lens satisfying
several equational laws specifying basic synchronization requirements is called
well-behaved (wb), and basic lens theorems state that sequential and parallel
composition of wb lenses is again wb. In practical applications, it allows the designer of
a complex synchronizer to avoid integration testing: if elementary synchronizers
are tested and proved to be wb, their composition is automatically wb too.

The present paper makes the following contributions to the delta lens
framework for Bx. a) We motivate model synchronization enriched with learning and,
moreover, with categorical learning, in which the parameter space is a
category, and introduce the notion of a wb asymmetric learning (delta) lens with
amendment (a wb ala-lens) (this is the content of Sect. 3). b) We prove
compositionality of wb ala-lenses and show how their universe can be organized into a
symmetric monoidal (sm) category (Theorems 1-3 in Sect. 4). All proofs (rather
straightforward but notationally laborious) can be found in the long version of
the paper [9]. One more compositional result is c) a definition of a compositional
bidirectional transformation language (Def. 6) that formalizes an important
requirement for model synchronization tools, which (surprisingly) is missing from
the Bx literature. Background Sect. 2 provides a simple example demonstrating
the main concepts of Bx and delta lenses in the MDE context. Section 5 briefly
surveys related work, and Sect. 6 concludes.


Notation. Given a category $\mathbf{A}$, its objects are denoted by capital letters $A$, $A'$,
etc. to recall that in MDE applications, objects are complex structures, which
themselves have elements $a, a', \ldots$; the collection of all objects of category $\mathbf{A}$
is denoted by $\mathbf{A}_0$. An arrow with domain $A \in \mathbf{A}_0$ is written as $u\colon A \to \_$ or
$u \in \mathbf{A}(A, \_)$; we also write $\mathsf{dom}(u) = A$ (and sometimes $u^{\mathsf{dom}} = A$ to shorten
formulas). Similarly, formula $u\colon \_ \to A'$ denotes an arrow with codomain $u.\mathsf{cod} = A'$.
Given a functor $f\colon \mathbf{A} \to \mathbf{B}$, its object function is denoted by $f_0\colon \mathbf{A}_0 \to \mathbf{B}_0$.

A subcategory $\mathbf{B} \subset \mathbf{A}$ is called wide if it has the same objects. All categories
we consider in the paper are small.


2 Background: Update propagation and delta lenses 


Although Bx ideas work well only in domains conforming to the slogan any im- 
plementation satisfying the specification is good enough such as code generation 
(see [10] for discussion), and have limited applications in databases (only so
called updatable views can be treated in the Bx-way), we will employ a simple
database example: it allows demonstrating the core ideas without any special
domain knowledge required by typical Bx-amenable areas. The presentation will 
be semi-formal as our goal is to motivate the delta lens formalism that abstracts 
the details away rather than formalize the example as such. 


2.1 Why deltas. 


Bx-lenses first appeared in the work on file synchronization, and if we have two
sets of strings, say, $B = \{\text{John}, \text{Mary}\}$ and $B' = \{\text{Jon}, \text{Mary}\}$, we can readily
see the difference: $\text{John} \neq \text{Jon}$ but $\text{Mary} = \text{Mary}$. We thus have a structure
in-between $B$ and $B'$ (which may be rather complex if $B$ and $B'$ are big files),
but this structure can be recovered by string matching and thus updates can be
identified with pairs. The situation dramatically changes if $B$ and $B'$ are object
structures, e.g., $B = \{o_1, o_2\}$ with $\mathit{Name}(o_1) = \text{John}$, $\mathit{Name}(o_2) = \text{Mary}$ and
similarly $B' = \{o'_1, o'_2\}$ with $\mathit{Name}(o'_1) = \text{Jon}$, $\mathit{Name}(o'_2) = \text{Mary}$. Now string
matching does not say too much: it may happen that $o_1$ and $o'_1$ are the same
object (think of a typo in the dataset), while $o_2$ and $o'_2$ are different (although
equally named) objects. Of course, for better matching we could use full names
or ID numbers or something similar (called, in the database parlance, primary
keys), but absolutely reliable keys are rare, and typos and bugs can compromise
them anyway. Thus, for object structures that Bx needs to keep in sync, deltas
between models need to be independently specified, e.g., by specifying a
sameness relation $u \subset B \times B'$ between models. For example, $u = \{(o_1, o'_1)\}$ says that
John@$B$ and Jon@$B'$ are the same person while Mary@$B$ and Mary@$B'$ are
not. Hence, model spaces in Bx are categories (objects are models and arrows
are update/delta specifications) rather than sets (codiscrete categories).
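The contrast between a delta guessed by string matching and an independently specified sameness relation can be made concrete with a small sketch. The dictionary encoding of object structures, the oid strings and the helper `match_by_name` are hypothetical and serve only to replay the John/Jon example above.

```python
# Two states of an object structure: oid -> Name (illustrative encoding).
B = {"o1": "John", "o2": "Mary"}
B_new = {"o1'": "Jon", "o2'": "Mary"}


def match_by_name(old, new):
    """Guess a delta by string matching: relate objects with equal names."""
    return {
        (o, n)
        for o, name in old.items()
        for n, other in new.items()
        if name == other
    }


# Independently specified sameness relation u, a subset of B x B':
# John@B and Jon@B' are the same person, the two Marys are not.
u = {("o1", "o1'")}

guessed = match_by_name(B, B_new)
print(guessed)        # {('o2', "o2'")}
print(guessed == u)   # False
```

String matching relates exactly the pair that the intended delta keeps apart and misses the pair it identifies, which is the reason why deltas are treated as first-class arrows rather than recovered from the two states.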


2.2 Consistency restoration via update propagation: An Example 


Figure 1 presents a simple example of delta propagation for consistency
restoration. Models consist of objects (in the sense of OO programming) with attributes
(a.k.a. labelled records), e.g., the source model $A$ consists of three objects
identified by their oids (object identifiers) #A, #J, #M (think about employees of
some company) with attribute values as shown in the table: attribute Expr. refers
to Experience measured by a number of years, and Depart. is the column of
department names. The schema of the table, i.e., the triple $S_A$ of attributes (Name,
Expr., Depart.) with their domains of values String, Integer, String resp.,
determines a model space $\mathbf{A}$. A model $X \in \mathbf{A}$ is given by its set of objects $\mathrm{OID}^X$
together with three functions $\mathit{Name}^X$, $\mathit{Expr.}^X$, $\mathit{Depart.}^X$ from the same domain
$\mathrm{OID}^X$ to targets String, Integer, String resp., which are compactly specified
by tables as shown for model $A$. The target model space $\mathbf{B}$ is given by a similar
schema $S_B$ consisting of two attributes. The $\mathbf{B}$-view $\mathsf{get}(X)$ of an $\mathbf{A}$-model $X$
is computed by selecting those oids $\#o \in \mathrm{OID}^X$ for which $\mathit{Depart.}^X(\#o)$ is an
IT-department, i.e., an element of the set $\mathit{IT} = \{\text{ML}, \text{DB}\}$. For example, the
upper part of the figure shows the IT-view $B$ of model $A$.
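The view computation just described can be sketched as a plain selection and projection. The encoding of a model as a dictionary from oids to attribute triples is an assumption, and the concrete attribute values (in particular Mary's experience and department) are illustrative rather than read off Fig. 1.

```python
IT = {"ML", "DB"}  # the set of IT-departments


def get(model):
    """Compute the IT-view: select objects whose department is in IT and
    project them to the two view attributes (Name, Expr.)."""
    return {
        oid: (name, expr)
        for oid, (name, expr, depart) in model.items()
        if depart in IT
    }


# Source model A: oid -> (Name, Expr., Depart.); values are illustrative.
A = {
    "#A": ("Ann", 10, "Sales"),
    "#J": ("John", 10, "DB"),
    "#M": ("Mary", 5, "ML"),
}

B = get(A)
print(B)  # {'#J': ('John', 10), '#M': ('Mary', 5)}
```

Ann is filtered out because Sales is not an IT-department, and the Depart. column is dropped since the view schema has only two attributes.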


180 Z. Diskin 


We assume that all column names in schemas $S_A$ and $S_B$ are qualified by
schema names, e.g., OID@$S_A$, OID@$S_B$, etc., so that schemas are disjoint except
elementary domains like String etc. Also disjoint are OID-values, e.g., #J@$A$ and
#J@$B$ are different elements, but constants like John and Mary are elements of
set String shared by both schemas. To shorten long expressions in the diagrams,
we will often omit qualifiers and write #J = #J meaning #J@$A$ = #J@$B$ or
#J@$B$ = #J@$B'$ depending on the context given by the diagram; often we will
also write #J and #J' for such OIDs. Also, when we write #J = #J inside
block arrows denoting updates, we actually mean a pair, e.g., (#J@$B$, #J@$B'$).


Given two models over the same schema, say, $B$ and $B'$ over $S_B$, an update
$v\colon B \to B'$ is a relation $v \subset \mathrm{OID}^{B} \times \mathrm{OID}^{B'}$; if a schema contains several nodes,
an update should provide a relation $v_N$ for each node $N$ in the schema.


Note an essential difference between the two parallel updates $v_1, v_2\colon B \to B'$
specified in the figure. Update $v_1$ says that John’s name was changed to Jon
(think of fixing a typo), and the experience data for Mary were also corrected
(either because of a typo or, e.g., because the department started to use a new
ML method for which Mary has a longer experience). Update $v_2$ specifies the
same story for John but a new story for Mary: it says that Mary #M left the
IT-view and Mary #M’ is a new employee in one of the IT-departments.


[Figure 1: diagram showing the source model A (columns OIDs, Name, Expr., Depart.), the target (view) model B for IT-departments (columns OIDs, Name, Expr.), and the updated source models obtained by update propagation.]

Fig. 1: Example of update propagation
2.3 Update propagation and update policies


The updated view $B'$ is inconsistent with the source $A$ and the latter is to be
updated accordingly; we say that update $v$ is to be propagated back to $A$.
Propagation of $v_1$ is easy: we just update accordingly the values of the attributes as
shown in the figure in the block arrow $u_1\colon A \to A'_1$ (of black colour). Importantly,
propagation needs two pieces of data: the view update $v_1$ and the original state
$A$ of the source as shown in the figure by two data-flow lines into the chevron
1:put; the latter denotes invocation of the backward propagation operation put
(read “put view update back to the source”). The quadruple $1 = (v_1, A, u_1, A'_1)$
can be seen as an instance of operation put, hence the notation 1:put (borrowed
from the UML).

Propagation of update $v_2$ is more challenging: Mary can disappear from the
IT-view because a) she quit the company, b) she transitioned to a non-IT
department, or c) the view definition has changed, e.g., the new view must only
show employees with experience of more than 5 years. Choosing between these
possibilities is often called choosing an (update) policy. We will consider the case of
changing the view in Sect. 3 and in the current section discuss policies a) and
b) (ignore for a while the propagation scenario shown in blue in the right lower
corner of the figure that shows policy c)).

For policy a), further referred to as quitting and briefly denoted by qt, the
result of update propagation is shown in the figure with green colour: notice
the update (block) arrow $u_2^{\mathsf{qt}}$ and its result, model $A_2'^{\mathsf{qt}}$, produced by invoking
operation $\mathsf{put}^{\mathsf{qt}}$. Note that while we know the new employee Mary works in one
of the IT departments, we do not know in which one. This is specified with a special
value '?' (a.k.a. labelled null in the database parlance).

For policy b), further referred to as transition and denoted tr, the result of
update propagation is shown in the figure with orange colour: notice update
arrow $u_2^{\mathsf{tr}}$ and its result, model $A_2'^{\mathsf{tr}}$, produced by $\mathsf{put}^{\mathsf{tr}}$. Mary #M is the old
employee who transitioned to a new non-IT department, for which her expertise
is unknown. Mary #M’ is a new employee in one of the IT-departments (we assume
that the set of departments is not exhausted by those appearing in a particular
state $A \in \mathbf{A}$). There are also updates whose backward propagation is uniquely
defined and does not need a policy, e.g., update $v_1$ is such.
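The two policies can be replayed on a toy encoding of the example. Everything below is a sketch under stated assumptions: models are dictionaries from oids to attribute tuples, qualifiers on oids are omitted, the sameness relation is assumed to be one-to-one, labelled nulls are represented by the hypothetical constants `NULL_IT` and `NULL_NOT_IT`, and the attribute values are illustrative rather than taken from Fig. 1.

```python
IT = {"ML", "DB"}
NULL_IT = "? (in IT)"        # labelled null: some unknown IT department
NULL_NOT_IT = "? (not IT)"   # labelled null: some unknown non-IT department


def get(source):
    """IT-view: keep (Name, Expr.) of objects working in an IT department."""
    return {
        oid: (name, expr)
        for oid, (name, expr, depart) in source.items()
        if depart in IT or depart == NULL_IT
    }


def put(source, same, new_view, policy):
    """Propagate a view update back to the source.

    same:     sameness relation, pairs (old view oid, new view oid)
    new_view: the updated view B'
    policy:   "qt" (quitting) or "tr" (transition)
    """
    old_view = get(source)
    kept = dict(same)
    # Objects outside the view are not affected by a view update.
    result = {oid: row for oid, row in source.items() if oid not in old_view}
    for old, new in kept.items():
        name, expr = new_view[new]
        result[new] = (name, expr, source[old][2])   # department unchanged
    for oid in old_view.keys() - kept.keys():        # objects that left the view
        if policy == "tr":
            result[oid] = (source[oid][0], None, NULL_NOT_IT)
        # under "qt" the object is simply deleted from the source
    for oid in new_view.keys() - set(kept.values()): # objects new in the view
        name, expr = new_view[oid]
        result[oid] = (name, expr, NULL_IT)
    return result


A = {"#A": ("Ann", 10, "Sales"), "#J": ("John", 10, "DB"), "#M": ("Mary", 5, "ML")}
v2_same = {("#J", "#J")}                              # only John is kept
B_new = {"#J": ("Jon", 10), "#M'": ("Mary", 7)}

A_qt = put(A, v2_same, B_new, "qt")
A_tr = put(A, v2_same, B_new, "tr")
print(get(A_qt) == B_new, get(A_tr) == B_new)  # True True
```

Both policies restore consistency with the same updated view, yet they produce different sources: under qt the old Mary disappears, under tr she stays with an unknown non-IT department.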

An important property of the update propagations we have considered is that
they restore consistency: the view of the updated source equals the updated
view that initiated the update: $\mathsf{get}(A') = B'$; moreover, this equality extends to
update arrows: $\mathsf{get}(u_i) = v_i$, $i = 1, 2$. Such extensions can be derived from view
definitions if the latter are determined by so called monotonic queries (which
encompass a wide class of practically useful queries including the
Select-Project-Join class). For views defined by non-monotonic queries, in order to obtain get’s
action on source updates $u\colon A \to A'$, a suitable policy is to be added to the
view definition (see [IJ14]12] for details and discussion). Moreover, normally get
preserves identity updates, $\mathsf{get}(\mathsf{id}_A) = \mathsf{id}_{\mathsf{get}(A)}$, and update composition: for any
$u\colon A \to A'$ and $u'\colon A' \to A''$, equality $\mathsf{get}(u; u') = \mathsf{get}(u); \mathsf{get}(u')$ holds.
2.4 Delta lenses


Our discussion of the example can be summarized in the following algebraic
terms. We have two categories of models and updates, $\mathbf{A}$ and $\mathbf{B}$, and a functor
$\mathsf{get}\colon \mathbf{A} \to \mathbf{B}$ incrementally computing $\mathbf{B}$-views of $\mathbf{A}$-models (we will often write
$A.\mathsf{get}$ for $\mathsf{get}(A)$). We also suppose that for a chosen update policy, we have
worked out precise procedures for how to propagate any view update backwards.
This gives us a family of operations $\mathsf{put}_A\colon \mathbf{A}(A, \_) \leftarrow \mathbf{B}(A.\mathsf{get}, \_)$ indexed by
$\mathbf{A}$-objects, $A \in \mathbf{A}_0$, for which we write $\mathsf{put}_A.v$ or $\mathsf{put}_A(v)$ interchangeably.
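Definition 1 (Delta Lenses ([12])) Let $\mathbf{A}$, $\mathbf{B}$ be two categories. An
(asymmetric delta) lens from $\mathbf{A}$ (the source of the lens) to $\mathbf{B}$ (the target) is a pair
$\ell = (\mathsf{get}, \mathsf{put})$, where $\mathsf{get}\colon \mathbf{A} \to \mathbf{B}$ is a functor and put is a family of operations
$\mathsf{put}_A\colon \mathbf{A}(A, \_) \leftarrow \mathbf{B}(A.\mathsf{get}, \_)$ indexed by objects of $\mathbf{A}$, $A \in \mathbf{A}_0$. Given $A$,
operation $\mathsf{put}_A$ maps any arrow $v\colon A.\mathsf{get} \to B'$ to an arrow $u\colon A \to A'$ such that
$A'.\mathsf{get} = B'$. The last condition is called the (co)discrete Putget law:


$(\mathsf{Putget})_0$  $(\mathsf{put}_A.v).\mathsf{cod}.\mathsf{get}_0 = v.\mathsf{cod}$ for all $A \in \mathbf{A}_0$ and $v \in \mathbf{B}(A.\mathsf{get}, \_)$


where $\mathsf{get}_0$ denotes the object function of functor get. We will write a lens as an
arrow $\ell\colon \mathbf{A} \to \mathbf{B}$ going in the direction of get.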


Note that family put corresponds to a chosen update policy, e.g., in terms
of the example above, for the same view functor get, we have two families
of put-operations, $\mathsf{put}^{\mathsf{qt}}$ and $\mathsf{put}^{\mathsf{tr}}$, corresponding to the two update policies
we discussed. These two policies determine two lenses $\ell^{\mathsf{qt}} = (\mathsf{get}, \mathsf{put}^{\mathsf{qt}})$ and
$\ell^{\mathsf{tr}} = (\mathsf{get}, \mathsf{put}^{\mathsf{tr}})$ sharing the same get.

Definition 2 (Well-behavedness) A (lens) equational law is an equation to
hold for all values of two variables: $A \in \mathbf{A}_0$ and $v\colon A.\mathsf{get} \to B'$. A lens is called
well-behaved (wb) if the following two laws hold:


(Stability)  $\mathsf{id}_A = \mathsf{put}_A.\mathsf{id}_{A.\mathsf{get}}$ for all $A \in \mathbf{A}_0$
(Putget)     $(\mathsf{put}_A.v).\mathsf{get} = v$ for all $A \in \mathbf{A}_0$ and all $v \in \mathbf{B}(A.\mathsf{get}, \_)$


Remark 1. The Stability law says that a wb lens does nothing if nothing happens on
the target side (no actions without triggers). Putget requires consistency after
the backward propagation is finished. Note the distinction between the $(\mathsf{Putget})_0$
condition included into the very definition of a lens, and the full Putget law
required for the wb lenses. The former is needed to ensure smooth tiling of
put-squares (i.e., arrow squares describing application of put to a view update
and its result) both horizontally (for sequential composition) and vertically (not
considered in the paper). The full Putget assures true consistency as considering
a state $B'$ alone does not say much about the real update and elements of $B'$
cannot be properly interpreted. The real story is specified by delta $v\colon B \to B'$,
and consistency restoration needs the full (Putget) law as above.
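The three laws can be checked mechanically on a very small lens. The sketch below is deliberately restricted to the codiscrete special case, where an arrow is just a pair (domain, codomain); the state spaces, the projection-style `get_obj` and the function names are assumptions chosen for illustration, not the paper's example.

```python
from itertools import product

# Source objects are pairs (x, y); target objects are the visible part x.
SOURCE = list(product(range(3), range(2)))
TARGET = list(range(3))


def get_obj(a):
    """Object function of get: project a source state to its view."""
    return a[0]


def get_arr(u):
    """Action of get on a (codiscrete) arrow u = (A, A')."""
    return (get_obj(u[0]), get_obj(u[1]))


def put(a, v):
    """Propagate a view update v = (A.get, B') back to the source A."""
    assert v[0] == get_obj(a), "v must start at the view of A"
    return (a, (v[1], a[1]))   # change x as requested, keep the hidden part y


def identity(obj):
    return (obj, obj)


def is_well_behaved():
    for a in SOURCE:
        if put(a, identity(get_obj(a))) != identity(a):   # Stability
            return False
        for b_new in TARGET:
            v = (get_obj(a), b_new)
            u = put(a, v)
            if get_obj(u[1]) != v[1]:                     # (Putget)_0
                return False
            if get_arr(u) != v:                           # full Putget
                return False
    return True


print(is_well_behaved())  # True
```

In the codiscrete case the full Putget law follows from its object-level version, so the two checks coincide here; with genuine deltas (e.g., sameness relations) the arrow-level check carries additional information.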


A more detailed trailer of lenses can be found in the long version [9]. 


² As shown in [6], the $(\mathsf{Putget})_0$ condition is needed if we want to define operations
put separately from the functor get: then we still need a function $\mathsf{get}_0\colon \mathbf{A}_0 \to \mathbf{B}_0$ and
the codiscrete Putget law to ensure a reasonable behaviour of put.
3 Asymmetric Learning Lenses with Amendments


We will begin with a brief motivating discussion, and then proceed with formal
definitions.


3.1 Does Bx need categorical learning? 


Enriching delta lenses with learning capabilities has a clear practical sense for
Bx. Having a lens $(\mathsf{get}, \mathsf{put})\colon \mathbf{A} \to \mathbf{B}$ and inconsistency $A.\mathsf{get} \neq B'$, the idea
of learning extends the notion of the search space and allows us to update the
transformation itself so that the final consistency is achieved for a new
transformation $\mathsf{get}'$: $A.\mathsf{get}' = B'$. For example, in the case shown in Fig. 1, disappearance
of Mary #M in the updated view $B'$ can be caused by changing the view
definition, which now requires showing only those employees whose experience is
more than 5 years and hence Mary #M is to be removed from the view, while
Mary #M’ is a new IT-employee whose experience satisfies the new definition.
Then the update $v_2$ can be propagated as shown in the bottom right corner of
Fig. 1, where index par indicates a new update policy allowing for view definition
(parameter) change.

To manage the extended search possibilities, we parameterize the space of
transformations as a family of mappings $\mathsf{get}_p\colon \mathbf{A} \to \mathbf{B}$ indexed over some
parameter space $p \in \mathbf{P}$. For example, we may define the IT-view to be parameterized
by the experience of employees shown in the view (including any experience as a
special parameter value). Then we have two interrelated propagation operations
that map an update $B \rightsquigarrow B'$ to a parameter update $p \rightsquigarrow p'$ and a source update
$A \rightsquigarrow A'$. Thus, the extended search space allows for new update policies that look
for updating the parameter as an update propagation possibility. The possibility
to update the transformation appears to be very natural in at least two
important Bx scenarios: a) model transformation design and b) model transformation
evolution (cf. [21]), which necessitates the enrichment of the delta lens
framework with parameterization and learning. Note that all transformations $\mathsf{get}_p$,
$p \in \mathbf{P}$, are to be elements of the same lens, and operations put are not indexed
by $p$, hence, formalization of learning by considering a family of ordinary lenses
would not do the job.
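A parameterized view in the spirit of this paragraph can be sketched as follows. The threshold semantics ("more than p years", with `None` standing for "any experience"), the dictionary encoding and the attribute values are assumptions made for illustration; the search over a fixed list of candidate parameters is only a stand-in for a real propagation procedure.

```python
IT = {"ML", "DB"}


def get(p, source):
    """IT-view parameterized by p: show IT employees with more than p years
    of experience; p = None stands for 'any experience'."""
    return {
        oid: (name, expr)
        for oid, (name, expr, depart) in source.items()
        if depart in IT and (p is None or expr > p)
    }


# Source model: oid -> (Name, Expr., Depart.); values are illustrative.
A = {
    "#A": ("Ann", 10, "Sales"),
    "#J": ("John", 10, "DB"),
    "#M": ("Mary", 5, "ML"),
}
B_new = {"#J": ("John", 10)}   # updated view: Mary #M is no longer shown

# Consistency can be restored by changing the parameter, not the source.
candidates = [p for p in (None, 0, 5, 10) if get(p, A) == B_new]
print(candidates)  # [5]
```

Here the source A stays untouched and the inconsistency is repaired by moving from the parameter value `None` to 5, which is the kind of policy that an ordinary lens with a fixed get cannot express.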


Categorical vs. codiscrete learning. Suppose that the parameter $p$ is itself
a set, e.g., the set of departments forming a view can vary depending on some
context. Then an update from $p$ to $p'$ has a relational structure as discussed
above, i.e., $e\colon p \to p'$ is a relation $e \subset p \times p'$ specifying which departments
disappeared from the view and which are freshly added. This is a general phenomenon:
as soon as parameters are structures (sets of objects or graphs of objects and
attributes), a parameter change becomes a structured delta and the space of
parameters gives rise to a category $\mathbf{P}$. The search/propagation procedure returns
an arrow $e\colon p \to p'$ in this category, which updates the parameter value from
$p$ to $p'$. Hence, a general model of supervised learning should assume $\mathbf{P}$ to be
a category (and we say that learning is categorical). The case of the parameter
space being a set is captured by considering a codiscrete category $\mathbf{P}$ whose only
arrows are pairs of its objects; we call such learning codiscrete.
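A parameter delta with relational structure can be illustrated on the department example. The department names and the variable names below are hypothetical; the sketch only shows how a sameness relation between two parameter values determines what disappeared and what is fresh, in contrast to the codiscrete reading where the delta is just the pair of states.

```python
p = {"ML", "DB"}        # old parameter: departments forming the view
p_new = {"ML", "AI"}    # new parameter (the name "AI" is illustrative)

# Categorical delta e: p -> p', a sameness relation between the two sets.
e = {("ML", "ML")}

disappeared = p - {old for old, _ in e}
fresh = p_new - {new for _, new in e}
print(disappeared, fresh)  # {'DB'} {'AI'}

# Codiscrete delta: nothing but the pair of parameter values.
e_codiscrete = (frozenset(p), frozenset(p_new))
```

A different relation, e.g. one that also relates "DB" to "AI" to record a renaming, would be a different arrow between the same two objects, which the codiscrete pair cannot distinguish.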


3.2 Ala-lenses 


The notion of a parameterized functor (p-functor) is fundamental for ala-lenses, 
but is not a lens notion per se and is thus placed into Appendix Sect. We will 
work with its exponential (rather than equivalent product-based) formulation
but will do uncurrying and currying back if necessary, and will often use the same
symbol for an arrow f and its uncurried version.


Definition 3 (ala-lenses) Let $\mathbf{A}$ and $\mathbf{B}$ be categories. An ala-lens from $\mathbf{A}$
(the source of the lens) to $\mathbf{B}$ (the target) is a pair $\ell = (\mathsf{get}, \mathsf{put})$ whose first
component is a p-functor $\mathsf{get}\colon \mathbf{A} \xrightarrow{\mathbf{P}} \mathbf{B}$ and the second component is a triple of
(families of) operations $\mathsf{put} = (\mathsf{put}^{\mathsf{upd}}_{p,A}, \mathsf{put}^{\mathsf{req}}_{p,A}, \mathsf{put}^{\mathsf{self}}_{p,A})$ indexed by pairs $p \in \mathbf{P}_0$,
$A \in \mathbf{A}_0$; arities of the operations are specified below after we introduce some
notation. Names req (for ’request’) and upd (for ’update’) are chosen to match
the terminology in [17].

Categories $\mathbf{A}$, $\mathbf{B}$ are called model spaces, their objects are models and their
arrows are (model) updates or deltas. Objects of $\mathbf{P}$ are called parameters and are
denoted by small letters $p, p', \ldots$ rather than capital ones to avoid confusion with
[17], in which capital P is used for the entire parameter set. Arrows of $\mathbf{P}$ are
called parameter deltas. For a parameter $p \in \mathbf{P}_0$, we write $\mathsf{get}_p$ for the functor
$\mathsf{get}(p)\colon \mathbf{A} \to \mathbf{B}$ (read “get $\mathbf{B}$-views of $\mathbf{A}$”), and if $A \in \mathbf{A}_0$ is a source model,
its $\mathsf{get}_p$-view is denoted by $\mathsf{get}_p(A)$ or $A.\mathsf{get}_p$ or even $A_p$ (so that $\_{}_p$ becomes
yet another notation for functor $\mathsf{get}_p$). Given a parameter delta $e\colon p \to p'$ and
a source model $A \in \mathbf{A}_0$, the model delta $\mathsf{get}(e)_A\colon \mathsf{get}_p(A) \to \mathsf{get}_{p'}(A)$ will be
denoted by $\mathsf{get}_e(A)$ or $e_A$ (rather than $A_e$ as we would like to keep capital letters
for objects only). In the uncurried version, $\mathsf{get}_e(A)$ is nothing but $\mathsf{get}(e, \mathsf{id}_A)$.

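Since $\mathsf{get}_e$ is a natural transformation, for any delta $u\colon A \to A'$ we have
a commutative square $e_A; u_{p'} = u_p; e_{A'}$ (whose diagonal is $\mathsf{get}(e, u)$). We will
denote the diagonal of this square by $u.\mathsf{get}_e$ or $u_e\colon A_p \to A'_{p'}$. Thus, we use
notation

\[
\begin{array}{l}
A_p \stackrel{\mathrm{def}}{=} A.\mathsf{get}_p \stackrel{\mathrm{def}}{=} \mathsf{get}_p(A) = \mathsf{get}(p)(A) \\[4pt]
u_e \stackrel{\mathrm{def}}{=} u.\mathsf{get}_e \stackrel{\mathrm{def}}{=} \mathsf{get}_e(u) \stackrel{\mathrm{def}}{=} \mathsf{get}(e)(u) = e_A; u_{p'} = u_p; e_{A'}\colon A_p \to A'_{p'}
\end{array}
\tag{1}
\]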


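Now we describe operations put. They all have the same indexing set $\mathbf{P}_0 \times \mathbf{A}_0$,
and the same domain: for any index $p, A$ and any model delta $v\colon A_p \to B'$ in $\mathbf{B}$,
the value $\mathsf{put}^{x}_{p,A}(v)$, $x \in \{\mathsf{req}, \mathsf{upd}, \mathsf{self}\}$, is defined and unique:

\[
\begin{array}{ll}
\mathsf{put}^{\mathsf{upd}}_{p,A}(v)\colon p \to p' & \text{is a parameter delta from } p, \\
\mathsf{put}^{\mathsf{req}}_{p,A}(v)\colon A \to A' & \text{is a model delta from } A, \\
\mathsf{put}^{\mathsf{self}}_{p,A}(v)\colon B' \to A'_{p'} & \text{is a model delta from } B' \\
 & \text{called the amendment and denoted by } v^@.
\end{array}
\tag{2}
\]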
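Note that the definition of $\mathsf{put}^{\mathsf{self}}$ involves an equational dependency between
all three operations: for all $A \in \mathbf{A}_0$, $v \in \mathbf{B}(A.\mathsf{get}_p, \_)$, we require

$(\mathsf{Putget})_0$  $(\mathsf{put}^{\mathsf{req}}_{p,A}.v).\mathsf{cod}.\mathsf{get}_{p'} = (v; \mathsf{put}^{\mathsf{self}}_{p,A}.v).\mathsf{cod}$ where $p' = (\mathsf{put}^{\mathsf{upd}}_{p,A}.v).\mathsf{cod}$

We will write an ala-lens as an arrow $\ell = (\mathsf{get}, \mathsf{put})\colon \mathbf{A} \xrightarrow{\mathbf{P}} \mathbf{B}$.
A lens is called (twice) codiscrete if categories $\mathbf{A}$, $\mathbf{B}$, $\mathbf{P}$ are codiscrete and
thus $\mathsf{get}\colon \mathbf{A} \xrightarrow{\mathbf{P}} \mathbf{B}$ is a parameterized function. If only $\mathbf{P}$ is codiscrete, we call
$\ell$ a codiscretely learning delta lens, while if only model spaces are codiscrete, we
call $\ell$ a categorically learning codiscrete lens.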


The diagram in Fig. 2 shows how a lens’ operations are interrelated. The upper
part shows an arrow $e\colon p \to p'$ in category $\mathbf{P}$ and two corresponding functors
from $\mathbf{A}$ to $\mathbf{B}$. The lower part is to be seen as a 3D-prism with visible front face
$A\,A_{p'}\,A'_{p'}\,A'$ and visible upper face $A\,A_p\,A_{p'}$; the bottom and two back faces
are invisible, and the corresponding arrows are dashed. The prism denotes an
algebraic term: given elements are shown with black fill and white font while
derived elements are blue (recalls being mechanically computed) and blank
(double-body arrows are considered as “blank”). The two pairs of arrows
originating from $A$ and $A'$ are not blank because they denote pairs of nodes (the
UML says links) rather than mappings/deltas between nodes. Equational
definitions of deltas $e$, $u$, $v^@$ are written up in the three callouts near them. The
right back face of the prism is formed by the two vertical derived deltas
$u_p = u.\mathsf{get}_p$ and $u_{p'} = u.\mathsf{get}_{p'}$, and the two matching them horizontal derived
deltas $e_A = \mathsf{get}_e(A)$ and $e_{A'} = \mathsf{get}_e(A')$; together they form a commutative
square due to the naturality of $\mathsf{get}(e)$ as explained earlier.

Fig. 2: Ala-lens operations


Definition 4 (Well-behavedness) An ala-lens is called well-behaved (wb) if the following two laws hold for all $p \in \mathbf{P}_0$, $A \in \mathbf{A}_0$ and $v\colon A_p \to B'$:

(Stability) if $v = id_{A_p}$, then all three propagated updates $e$, $u$, $v^@$ are identities:

$put^{upd}_{p,A}(id_{A_p}) = id_p$, $put^{req}_{p,A}(id_{A_p}) = id_A$, $put^{self}_{p,A}(id_{A_p}) = id_{A_p}$

(Putget) $(put^{req}_{p,A}.v).get_e = v; v^@$ where $e = put^{upd}_{p,A}(v)$ and $v^@ = put^{self}_{p,A}(v)$


Remark 2. Note that Remark 1 about the Putget law is again applicable.


Example 1 (Identity lenses). Any category A gives rise to an ala-lens $id_{\mathbf{A}}$ with the following components. The source and target spaces are equal to A, and the parameter space is 1. Functor get is the identity functor and all puts are identities. Obviously, this lens is wb.
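To make the two laws concrete, here is a minimal Python sketch of the fully codiscrete special case, in which a delta from x to y is simply the pair (x, y) and can be represented by its target. The class name `CodiscreteLens` and the helper names are hypothetical and not taken from the paper; the sketch only checks the laws on sample states for the identity lens.

```python
from dataclasses import dataclass
from typing import Any, Callable, Tuple


@dataclass(frozen=True)
class CodiscreteLens:
    """Hypothetical codiscrete stand-in for an ala-lens.

    get(p, a) returns the view b; put(p, a, b_new) returns the triple
    (p_new, a_new, b_amended), i.e. the targets of the updates
    put^upd, put^req and the amendment put^self.
    """
    get: Callable[[Any, Any], Any]
    put: Callable[[Any, Any, Any], Tuple[Any, Any, Any]]


def is_stable(lens: CodiscreteLens, p: Any, a: Any) -> bool:
    """Stability: propagating the identity update changes nothing."""
    b = lens.get(p, a)
    return lens.put(p, a, b) == (p, a, b)


def satisfies_putget(lens: CodiscreteLens, p: Any, a: Any, b_new: Any) -> bool:
    """Putget with amendment: get at the new parameter hits the amended view."""
    p_new, a_new, b_amended = lens.put(p, a, b_new)
    return lens.get(p_new, a_new) == b_amended


# Identity lens: parameter space {"*"}, get is the identity,
# put^req copies the update, put^upd and put^self are identities.
identity_lens = CodiscreteLens(
    get=lambda p, a: a,
    put=lambda p, a, b_new: (p, b_new, b_new),
)
```

Checking the laws on finitely many states is of course only a test, not a proof of well-behavedness.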


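Example 2 (Iso-lenses). Let $\iota\colon \mathbf{A} \to \mathbf{B}$ be an isomorphism between model spaces. It gives rise to a wb ala-lens $\ell(\iota)\colon \mathbf{A} \to \mathbf{B}$ with $\mathbf{P}^{\ell(\iota)} = \mathbf{1} = \{*\}$ as follows. Given any $A$ in A and $v\colon \iota(A) \to B'$ in B, we define $put^{\ell(\iota).req}_{*,A}(v) = \iota^{-1}(v)$ while the two other put operations map $v$ to identities.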


Example 3 (Bx lenses). Examples of wb aa-lenses modelling a Bx can be found in [I]: they all can be considered as ala-lenses with a trivial parameter space 1.


Example 4 (Learners). Learners defined in [17] are codiscretely learning codiscrete lenses with amendment, and as such satisfy (the amended) Putget (Remark 1). Looking in the opposite direction, ala-lenses are a categorification of learners as detailed in Fig. 8.


4 Compositionality of ala-lenses 


This section explores the compositional structure of the universe of ala-lenses; 
especially interesting is their sequential composition. We will begin with a small 
example demonstrating sequential composition of ordinary lenses and showing 
that the notion of update policy transcends individual lenses. Then we define 
sequential and parallel composition of ala-lenses (the former is much more in- 
volved than for ordinary lenses) and show that wb ala-lenses can be organized 
into an sm-category. Finally, we formalize the notion of a compositional update 
policy via the notion of a compositional bidirectional language. 


4.1 Compositionality of update policies: An example 


Fig. 3 extends the example in Fig. 1 with a new model space C whose schema consists of the only attribute Name, and a view of the IT-view, in which only employees of the ML department are to be shown. Thus, we now have two functors, get1: A → B and get2: B → C, and their composition Get: A → C (referred to as the long get). The top part of Fig. 3 shows how it works for model A considered above.

Each of the two policies, policy qt (green) and policy tr (orange), in which a person's disappearance from the view is interpreted, resp., as quitting the company and transitioning to a department not included in the view, is applicable to the new view mappings get2 and Get, thus giving us six lenses shown in Fig. 4 with solid arrows; amongst them, lenses $L^{qt}$ and $L^{tr}$ are obtained by applying policy pol to the (long) functor Get, and we will refer to them as long lenses. In addition, we can compose lenses of the same colour as shown in Fig. 4 by dashed arrows (and we can also compose lenses of different colours ($\ell_1^{qt}$ with $\ell_2^{tr}$ and $\ell_1^{tr}$ with $\ell_2^{qt}$) but we do not need them). Now an important question is how long and composed lenses are related: are $L^{pol}$ and $\ell_1^{pol};\ell_2^{pol}$, for pol ∈ {qt, tr}, equal (perhaps up to some equivalence) or different?
[Figure: source model A (columns OIDs, Name, Expr., Depart.), view B (IT departments) obtained by get1, view C (ML dep.) obtained by get2, and their updated versions produced by put2 and put1.]

Fig. 3: Example cont'd: functoriality of update policies


Fig. 3 demonstrates how the mechanisms work with a simple example. We begin with an update $w$ of the view C that says that Mary #M left the ML department, and a new Mary #M′ was hired for ML. Policy qt interprets Mary's disappearance as quitting the company, and hence this Mary doesn't appear in view $B'^{qt}$ produced by $put2^{qt}$ nor in view $A'^{qt}$ produced from $B'^{qt}$ by $put1^{qt}$, and updates $v^{qt}$ and $u^{qt}$ are written accordingly. Obviously, Mary also does not appear in view $A'^{qt}$ produced by the long lens's $Put^{qt}$. Thus, $put1^{qt}(put2^{qt}(w)) = Put^{qt}(w)$, and it is easy to understand that such equality will hold for any source model $A$ and any update $w\colon C \to C'$ due to the nature of our two views get1 and get2. Hence, $L^{qt} = \ell_1^{qt};\ell_2^{qt}$ where $L^{qt} = (Get, Put^{qt})$ and $\ell_i^{qt} = (get_i, put_i^{qt})$.

Fig. 4: Lens combination schemas for Fig. 3

The situation with policy tr is more interesting. Model $A'^{tr}_{12}$ produced by the composed lens $\ell_1^{tr};\ell_2^{tr}$, and model $A'^{tr}$ produced by the long lens $L^{tr} = (Get, Put^{tr})$ are different as shown in the figure (notice the two different values for Mary's department framed with red ovals in the models). Indeed, the composed lens has more information about the old employee Mary: it knows that Mary was in the IT view, and hence can propagate the update more accurately. The comparison update $\delta^{tr}_{A,w}\colon A'^{tr} \to A'^{tr}_{12}$ adds this missing information so that equality $u^{tr};\delta^{tr}_{A,w} = u^{tr}_{12}$ holds. This is a general phenomenon: functor composition loses information and, in general, functor Get = get1; get2 knows less than the pair (get1, get2). Hence, operation Put back-propagating updates over Get (we will also say inverting Get) will, in general, result in less certain models than composition put1 ∘ put2 that inverts the composition get1; get2 (a discussion and examples of this phenomenon in the context of vertical composition of updates can be found in [8]). Hence, comparison updates such as $\delta^{tr}_{A,w}$ should exist for any $A$ and any $w\colon A.Get \to C'$, and together they should give rise to something like a natural transformation between lenses, $\delta^{tr}_{A,B,C}\colon L^{tr} \Rightarrow \ell_1^{tr};\ell_2^{tr}$. To make this notion precise, we need a notion of natural transformation between "functors" put, which we leave for future work. In the present paper, we will consider policies like qt, for which strict equality holds.


4.2 Sequential composition of ala-lenses 


Let $k\colon \mathbf{A} \to \mathbf{B}$ and $\ell\colon \mathbf{B} \to \mathbf{C}$ be two ala-lenses with parameterized functors $get^k\colon \mathbf{P} \to [\mathbf{A}, \mathbf{B}]$ and $get^\ell\colon \mathbf{Q} \to [\mathbf{B}, \mathbf{C}]$ resp. Their composition is the following ala-lens $k;\ell$. Its parameter space is the product P × Q, and the get-family is defined as follows. For any pair of parameters $(p,q)$ (we will write $pq$), $get^{k;\ell}_{pq} = get^k_p; get^\ell_q\colon \mathbf{A} \to \mathbf{C}$. Given a pair of parameter deltas, $e\colon p \to p'$ in P and $h\colon q \to q'$ in Q, their $get^{k;\ell}$-image is the Godement product $*$ of natural transformations,

$get^{k;\ell}(eh) = get^k(e) * get^\ell(h)$ (we will also write $get^k_e \,\|\, get^\ell_h$)


Fig. 5: Sequential composition of apa-lenses 


Now we define the propagation operations (puts) of $k;\ell$. Let $(A, pq, A_{pq})$ with $A \in \mathbf{A}_0$, $pq \in (\mathbf{P} \times \mathbf{Q})_0$, $A.get^k_p.get^\ell_q = A_{pq} \in \mathbf{C}_0$ be a state of lens $k;\ell$, and let $w\colon A_{pq} \to C'$ be a target update as shown in Fig. 5. For the first propagation step, we run lens $\ell$ as shown in Fig. 5 with the blue colour for derived elements: this is just an instantiation of the pattern of Fig. 2 with the source object being $A_p = A.get_p$ and parameter $q$. The results are deltas

(3) $h = put^{\ell.upd}_{q,A_p}(w)\colon q \to q'$, $v = put^{\ell.req}_{q,A_p}(w)\colon A_p \to B'$, $w^@ = put^{\ell.self}_{q,A_p}(w)\colon C' \to B'_{q'}$.


Next we run lens $k$ at state $(p, A)$ and the target update $v$ produced by lens $\ell$; it is yet another instantiation of the pattern in Fig. 2 (this time with the green colour for derived elements), which produces three deltas

(4) $e = put^{k.upd}_{p,A}(v)\colon p \to p'$, $u = put^{k.req}_{p,A}(v)\colon A \to A'$, $v^@ = put^{k.self}_{p,A}(v)\colon B' \to A'_{p'}$.


These data specify the green prism adjoint to the blue prism: the edge $v$ of the latter is the "first half" of the right back face diagonal $A_p A'_{p'}$ of the former. In order to make an instance of the pattern in Fig. 2 for lens $k;\ell$, we need to extend the blue-green diagram to a triangle prism by filling-in the corresponding "empty space". These filling-in arrows are provided by functors $get^\ell$ and $get^k$ and shown in orange (where we have chosen one of the two equivalent ways of forming the Godement product; note the two curved brown arrows). In this way we obtain yet another instantiation of the pattern in Fig. 2 denoted by $k;\ell$:

(5) $put^{(k;\ell).upd}_{pq,A}(w) = (e \,\|\, h)$, $put^{(k;\ell).req}_{pq,A}(w) = u$, $put^{(k;\ell).self}_{pq,A}(w) = w^@; v^@_{q'}$

where $v^@_{q'}$ denotes $v^@.get_{q'}$. Thus, we built an ala-lens $k;\ell$, which satisfies equation $(\mathsf{Putget})_0$ by construction.


Theorem 1 (Sequential composition and lens laws). Given ala-lenses $k\colon \mathbf{A} \to \mathbf{B}$ and $\ell\colon \mathbf{B} \to \mathbf{C}$, let lens $k;\ell\colon \mathbf{A} \to \mathbf{C}$ be their sequential composition as defined above. Then the lens $k;\ell$ is wb as soon as lenses $k$ and $\ell$ are such.


See [9, Appendix A.3] for a proof.
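The construction of sequential composition can be sketched in Python for the codiscrete special case, where every delta is determined by its source and target. A lens is modelled here as a hypothetical pair `(get, put)` with `get(p, a)` and `put(p, a, b_new) -> (p_new, a_new, b_amended)`; the two sample lenses `k` and `l` are illustrative assumptions, not examples from the paper.

```python
def compose(k, l):
    """Sequential composition k;l of two codiscrete lenses (a sketch).

    Parameters of the result are pairs (p, q).
    """
    get_k, put_k = k
    get_l, put_l = l

    def get(pq, a):
        p, q = pq
        return get_l(q, get_k(p, a))

    def put(pq, a, c_new):
        p, q = pq
        b = get_k(p, a)
        # First step: run l on the target update (deltas h, v, w@).
        q_new, b_new, _c_amended = put_l(q, b, c_new)
        # Second step: run k on the update v produced by l (deltas e, u, v@).
        p_new, a_new, b_amended = put_k(p, a, b_new)
        # Composed amendment w@ ; v@.get_{q'}: its target is get_l(q', cod v@).
        return (p_new, q_new), a_new, get_l(q_new, b_amended)

    return get, put


# k learns only its parameter: get adds p, put adjusts p and keeps a.
k = (lambda p, a: a + p,
     lambda p, a, b_new: (b_new - a, a, b_new))

# l doubles its input; odd targets are amended to the nearest lower even number.
l = (lambda q, b: 2 * b,
     lambda q, b, c_new: (q, c_new // 2, 2 * (c_new // 2)))

get_kl, put_kl = compose(k, l)
```

For the state with parameters `(1, "*")` and source `3`, the target update to `11` is amended to `10`, and applying `get_kl` to the updated parameter and source returns the same amended value, as Putget requires.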


4.3 Parallel composition of ala-lenses 


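Let $\ell_i\colon \mathbf{A}_i \to \mathbf{B}_i$, $i = 1,2$ be two ala-lenses with parameter spaces $\mathbf{P}_i$. The lens $\ell_1 \| \ell_2\colon \mathbf{A}_1 \times \mathbf{A}_2 \to \mathbf{B}_1 \times \mathbf{B}_2$ is defined as follows. Parameter space $(\ell_1 \| \ell_2).\mathbf{P} = \mathbf{P}_1 \times \mathbf{P}_2$. For any pair $p_1 \| p_2 \in (\mathbf{P}_1 \times \mathbf{P}_2)_0$, define $get^{\ell_1 \| \ell_2}_{p_1 \| p_2} = get^{\ell_1}_{p_1} \times get^{\ell_2}_{p_2}$ (we denote pairs of parameters by $p_1 \| p_2$ rather than $p_1 \otimes p_2$ to shorten long formulas going beyond the page width). Further, for any pair of models $A_1 \| A_2 \in (\mathbf{A}_1 \times \mathbf{A}_2)_0$ and deltas $v_1 \| v_2\colon (A_1 \| A_2).get^{\ell_1 \| \ell_2}_{p_1 \| p_2} \to B'_1 \| B'_2$, we define componentwise

$e = put^{(\ell_1 \| \ell_2).upd}_{p_1 \| p_2,\, A_1 \| A_2}(v_1 \| v_2)\colon p_1 \| p_2 \to p'_1 \| p'_2$

by setting $e = e_1 \| e_2$ where $e_i = put^{\ell_i.upd}_{p_i, A_i}(v_i)$, $i = 1,2$, and similarly for $put^{(\ell_1 \| \ell_2).req}_{p_1 \| p_2,\, A_1 \| A_2}$ and $put^{(\ell_1 \| \ell_2).self}_{p_1 \| p_2,\, A_1 \| A_2}$. The following result is obvious.

Theorem 2 (Parallel composition and lens laws). Lens $\ell_1 \| \ell_2$ is wb as soon as lenses $\ell_1$ and $\ell_2$ are such.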
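The componentwise definition can be illustrated by a short Python sketch for codiscrete lenses, again modelled as hypothetical pairs `(get, put)` with `put(p, a, b_new) -> (p_new, a_new, b_amended)`. The sample lenses are assumptions made only for the illustration.

```python
def parallel(l1, l2):
    """Parallel composition l1 || l2 of two codiscrete lenses (a sketch)."""
    get1, put1 = l1
    get2, put2 = l2

    def get(p, a):
        return (get1(p[0], a[0]), get2(p[1], a[1]))

    def put(p, a, b_new):
        r1 = put1(p[0], a[0], b_new[0])
        r2 = put2(p[1], a[1], b_new[1])
        # Regroup (p1', a1', b1'') and (p2', a2', b2'') into pairs of pairs.
        return tuple(zip(r1, r2))

    return get, put


# Two sample lenses: one learns an additive offset, the other is an identity lens.
offset_lens = (lambda p, a: a + p,
               lambda p, a, b_new: (b_new - a, a, b_new))
identity_lens = (lambda p, a: a,
                 lambda p, a, b_new: (p, b_new, b_new))

get_par, put_par = parallel(offset_lens, identity_lens)
```

Each component is propagated independently, so Putget for the pair reduces to Putget for each component.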
4.4 Symmetric monoidal structure over ala-lenses


Our goal is to organize ala-lenses into an sm-category. To make sequential compo- 
sition of ala-lenses associative, we need to consider them up to some equivalence 
(indeed, Cartesian product is not strictly associative). 


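Definition 5 (Ala-lens Equivalence) Two parallel ala-lenses $\ell, \hat{\ell}\colon \mathbf{A} \to \mathbf{B}$ are called equivalent if their parameter spaces are isomorphic via a functor $\iota\colon \mathbf{P} \to \hat{\mathbf{P}}$ such that for any $A \in \mathbf{A}_0$, $e\colon p \to p' \in \mathbf{P}$ and $v\colon (A.get_p) \to T$ the following holds (for $x \in \{req, self\}$):

$A.get_e = A.\widehat{get}_{\iota(e)}$, $\iota(put^{upd}_{p,A}(v)) = \widehat{put}^{upd}_{\iota(p),A}(v)$, and $put^{x}_{p,A}(v) = \widehat{put}^{x}_{\iota(p),A}(v)$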


Remark 3. It would be more categorical to require delta isomorphisms (i.e., com- 
mutative squares whose horizontal edges are isomorphisms) rather than equali- 
ties as above. However, model spaces appearing in Bx-practice are skeletal cat- 
egories (and even stronger than skeletal in the sense that all isos, including iso 
loops, are identities), for which isos become equalities so that the generality 


would degenerate into equality anyway. 


It is easy to see that operations of lens’ sequential and parallel composition 
are compatible with lens’ equivalence and hence are well-defined for equivalence 
classes. Below we identify lenses with their equivalence classes by default. 


Theorem 3 (Ala-lenses form an sm-category). Operations of sequential 
and parallel composition of ala-lenses defined above give rise to an sm-category 
aLaLens, whose objects are model spaces (= categories) and arrows are (equivalence classes of) ala-lenses. See [9, p.17 and Appendix A.4] for a proof.


4.5 Functoriality of learning in the delta lens setting 


As the example in Sect. 4.1 shows, the notion of update policy transcends individual
lenses. Hence, its proper formalization needs considering the entire category of 
ala-lenses and functoriality of a suitable mapping. 


Definition 6 (Bx-transformation language) A compositional bidirectional model transformation language $L_{bx}$ is given by (i) an sm-category $pGet(L_{bx})$ whose objects are ($L_{bx}$-)model spaces and arrows are ($L_{bx}$-)transformations, which is supplied with a forgetful functor into pCat, and (ii) an sm-functor $L_{bx}\colon pGet(L_{bx}) \to aLaLens$ such that the lower triangle in the inset diagram commutes. (Forgetful functors in this diagram are named "−X" with X referring to the structure to be forgotten.)

[Inset diagram: the functor $L_{bx}$ from $pGet(L_{bx})$ to aLaLens, with $aLaLens_{wb}$ above (upper triangle) and the forgetful functors into pCat below (lower triangle).]

An $L_{bx}$-language is well-behaved (wb) if functor $L_{bx}$ factorizes as shown by the upper triangle of the diagram.
Example. A major compositionality result of Fong et al [17] states the existence of an sm-functor from the category Para of Euclidean spaces and parameterized differentiable functions (pd-functions) into the category Learn of learning algorithms (learners) as shown by the inset commutative diagram. (The functor is itself parameterized by a step size $0 < \varepsilon \in \mathbb{R}$ and an error function $err\colon \mathbb{R} \times \mathbb{R} \to \mathbb{R}$ needed to specify the gradient descent procedure.) However, learners are nothing but codiscrete ala-lenses (see Sect. A.2), and thus the inset diagram is a codiscrete specialization of the diagram in Def. 6 above. That is, the category of Euclidean spaces and pd-functions, and the gradient descent method for back propagation, give rise to a (codiscrete) compositional bx-transformation language (over pSet rather than pCat).

[Inset diagram: the functor Para → Learn, with forgetful functors from both categories into pSet.]
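As a rough illustration of how a pd-function gives rise to a learner, the following Python sketch builds the implement, update and request functions for the one-dimensional function I(p, a) = p·a. It assumes the quadratic error ½(x − y)², for which the gradient-descent update is p − ε·∂E/∂p and the request is a − ∂E/∂a; the function name `make_learner` and the choice of I are hypothetical and only serve the example.

```python
def make_learner(eps):
    """Gradient-descent learner for the sample function I(p, a) = p * a.

    Assumes the quadratic error E(p, a, b) = 0.5 * (p * a - b) ** 2.
    """
    def implement(p, a):
        return p * a

    def update(p, a, b):
        # Gradient step on the parameter: dE/dp = (p*a - b) * a.
        return p - eps * (p * a - b) * a

    def request(p, a, b):
        # Requested input: a - dE/da, where dE/da = (p*a - b) * p.
        return a - (p * a - b) * p

    return implement, update, request


implement, update, request = make_learner(eps=0.1)
p_new = update(1.0, 2.0, 4.0)   # training pair (a, b) = (2.0, 4.0)
a_req = request(1.0, 2.0, 4.0)
```

With the training pair (2.0, 4.0), one update step moves the parameter from 1.0 towards 2.0, so the implemented value moves closer to the desired output.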

Finding a specifically Bx instance of Def. 6 (e.g., checking whether it holds
for concrete languages and tools such as EMOFLON [23] or GROUNDTRAM [22]) 
is laborious and left for future work. 


5 Related work 

Fig. 6 on the right is a simplified version of Fig. 8, convenient for our discussion here: immediate related work should be found in areas located at points (0,1) (codiscrete learning lenses) and (1,0) (delta lenses) of the plane. For the point (0,1), the paper [17] by Fong, Spivak and Tuyéras is fundamental: they defined the notion of a codiscrete learning lens (called a learner), proved a fundamental result about sm-functoriality of the gradient descent approach to ML, and thus laid a foundation for the compositional approach to change propagation with learning. One follow-up of that work is paper [16] by Fong and Johnson, in which they build an sm-functor Learn → sLens which maps learners to so-called symmetric lenses. That paper is probably the first one where the terms 'lens' and 'learner' are met, but the initial observation that a learner whose parameter set is a singleton is actually a lens is due to Jules Hedges, see [16].

[Fig. 6: plane with a parameter-space axis and a model-space axis, showing learning lenses and delta lenses at the points discussed above.]

There are conceptual and technical distinctions between [16] and the present
paper. On the conceptual level, by encoding learners as symmetric lenses, they 
“hide” learning inside the lens framework and make it a technical rather than 
conceptual idea. In contrast, we consider parameterization and supervised learn- 
ing as a fundamental idea and a first-class citizen for the lens framework, which 
grants creation of a new species of lenses. Moreover, while an ordinary lens is a 
way to invert a functor, a learning lens is a way to invert a parameterized func- 
tor so that learning lenses appear as an extension of the parameterization idea 
from functors to lenses. (This approach can probably be specified formally by 
treating parameterization as a suitably defined functorial construction.) Besides 
technical advantages (working with asymmetric lenses is simpler), our asymmetric model seems more adequate to the problem of learning functions rather than relations. On the technical level, the lens framework we develop in the paper is much more general than in [16]: we categorificated both the parameter space
and model spaces, and we work with lenses with amendment (which allows us 
to relax the Putget law if needed). 

As for the delta lens roots (the point (1,0) in the figure), delta lenses were 
motivated and formally defined in [12] (the asymmetric case) and [13] (the symmetric one). Categorical foundations for the delta lens theory were developed
by Johnson and Rosebrugh in a series of papers (see [20] for references); this 
line is continued in Clarke’s work [6]. The notion of a delta lens with amend- 
ments (in both asymmetric and symmetric variants) was defined in [I], and 
several composition results were proved. Another extensive body of work within 
the delta-based area is modelling and implementing model transformations with 
triple-graph grammars (TGG) [4,23]. TGG provide an implementation framework for delta lenses as is shown and discussed in [5,19,2], and thus inevitably
consider change propagation on a much more concrete level than lenses. The 
author is not aware of any work considering functoriality of update policies 
developed within the TGG framework. 

The present paper is probably the first one at the intersection (1,1) of the 
plane. The preliminary results have recently been reported at ACT’19 in Oxford 
to a representative lens community, and no references besides [17], [16] mentioned
above were provided. 


6 Conclusion 


The perspective on Bx presented in the paper is an example of a fruitful interaction between two domains, ML and Bx. In order to be ported to Bx, the compositional approach to ML developed in [17] is to be categorificated as shown in Fig. 8. This opens a whole new program for Bx: checking that currently existing Bx languages and tools are compositional (and well-behaved) in the sense of Def. 6. The wb compositionality is an important practical requirement as it allows for modular design and testing of bidirectional transformations. Surprisingly, this important requirement has been missing from the agenda of the Bx community, e.g., the recent endeavour of developing an effective benchmark for Bx-tools [3] does not discuss it.

In a wider context, the main message of the paper is that the learning idea 
transcends its applications in ML: it is applicable and usable in many domains in 
which lenses are applicable such as model transformations, data migration, and 
open games [18]. Moreover, the categorificated learning may perhaps find useful 
applications in ML itself. In the current ML setting, the object to be learnt is a function $f\colon \mathbb{R}^m \to \mathbb{R}^n$ that, in the OO class modelling perspective, is a very simple structure: it can be seen as one object with a (huge) number of attributes, or, perhaps, a predefined set of objects, which is not allowed to be changed during the search (only attribute values may be changed). In the delta lens view, such changes constitute a rather narrow class of updates and thus unjustifiably narrow the search space. Learning with the possibility to change dimensions m, n may be an appropriate option in several contexts. On the other hand, while categorification of model spaces extends the search space, categorification of the parameter space would narrow the search space as we are allowed to replace a parameter $p$ by parameter $p'$ only if there is a suitable arrow $e\colon p \to p'$ in category P. This narrowing may, perhaps, improve performance. All in all, the interaction between ML and Bx could be bidirectional!


A Appendices 
A.1 Category of parameterized functors pCat 


Category pCat has all small categories as objects. pCat-arrows A → B are parameterized functors (p-functors), i.e., functors $f\colon \mathbf{P} \to [\mathbf{A},\mathbf{B}]$ with P a small category of parameters and [A,B] the category of functors from A to B and their natural transformations. For an object $p$ and an arrow $e\colon p \to p'$ in P, we write $f_p$ for the functor $f(p)\colon \mathbf{A} \to \mathbf{B}$ and $f_e$ for the natural transformation $f(e)\colon f_p \Rightarrow f_{p'}$. We will write p-functors as labelled arrows $f\colon \mathbf{A} \xrightarrow{\mathbf{P}} \mathbf{B}$. As Cat is Cartesian closed, we have a natural isomorphism between Cat(P,[A,B]) and Cat(P×A,B) and can reformulate the above definition in an equivalent way with functors P×A → B. We prefer the former formulation as it corresponds to the notation $f\colon \mathbf{A} \xrightarrow{\mathbf{P}} \mathbf{B}$ visualizing P as a hidden state of the transformation, which seems adequate to the intuition of parameterization in our context. (If some technicalities are easier to see with the product formulation, we will switch to the product view, thus doing currying and uncurrying without special mention.) Sequential composition of $f\colon \mathbf{A} \xrightarrow{\mathbf{P}} \mathbf{B}$ and $g\colon \mathbf{B} \xrightarrow{\mathbf{Q}} \mathbf{C}$ is $f.g\colon \mathbf{A} \xrightarrow{\mathbf{P} \times \mathbf{Q}} \mathbf{C}$ given by $(f.g)_{pq} = f_p.g_q$ for objects, i.e., pairs $p \in \mathbf{P}$, $q \in \mathbf{Q}$, and by the Godement product of natural transformations for arrows in P×Q. That is, given a pair $e\colon p \to p'$ in P and $h\colon q \to q'$ in Q, we define the transformation $(f.g)_{eh}\colon f_p.g_q \Rightarrow f_{p'}.g_{q'}$ to be the Godement product $f_e * g_h$.


Any category A gives rise to a p-functor $Id_{\mathbf{A}}\colon \mathbf{A} \xrightarrow{\mathbf{1}} \mathbf{A}$, whose parameter space is a singleton category 1 with the only object $*$, $Id_{\mathbf{A}}(*) = id_{\mathbf{A}}$ and $Id_{\mathbf{A}}(id_*)\colon id_{\mathbf{A}} \Rightarrow id_{\mathbf{A}}$ is the identity transformation. It's easy to see that p-functors Id are units of the sequential composition. To ensure associativity we need to consider p-functors up to an equivalence of their parameter spaces. Two parallel p-functors $f\colon \mathbf{A} \xrightarrow{\mathbf{P}} \mathbf{B}$ and $\hat{f}\colon \mathbf{A} \xrightarrow{\hat{\mathbf{P}}} \mathbf{B}$ are equivalent if there is an isomorphism $\alpha\colon \mathbf{P} \to \hat{\mathbf{P}}$ such that two parallel functors $f\colon \mathbf{P} \to [\mathbf{A},\mathbf{B}]$ and $\alpha;\hat{f}\colon \mathbf{P} \to [\mathbf{A},\mathbf{B}]$ are naturally isomorphic; then we write $f \approx_\alpha \hat{f}$. It's easy to see that if $f \approx_\alpha \hat{f}\colon \mathbf{A} \to \mathbf{B}$ and $g \approx_\beta \hat{g}\colon \mathbf{B} \to \mathbf{C}$, then $f;g \approx_{\alpha \times \beta} \hat{f};\hat{g}\colon \mathbf{A} \to \mathbf{C}$, i.e., sequential composition is stable under equivalence. Below we will identify p-functors and their equivalence classes. Using a natural isomorphism (P×Q)×R ≅ P×(Q×R), strict associativity of the functor composition and strict associativity of the Godement product, we conclude that sequential composition of (equivalence classes of) p-functors is strictly associative. Hence, pCat is a category.
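At the level of sets and functions (the codiscrete case), the composition of parameterized maps and the role of the reparameterization isomorphism can be sketched in Python as follows. The names `compose_p` and `identity_p` and the sample functions are hypothetical; natural transformations and the Godement product are not modelled.

```python
def compose_p(f, g):
    """Composite f.g with parameter pairs: (f.g)_{pq} = f_p followed by g_q."""
    return lambda pq: (lambda a: g(pq[1])(f(pq[0])(a)))


def identity_p(star):
    """Unit of composition; its parameter space is the singleton {"*"}."""
    return lambda a: a


# Sample parameterized functions on integers.
f = lambda p: (lambda a: a + p)
g = lambda q: (lambda b: b * q)
h = lambda r: (lambda c: c - r)

left = compose_p(compose_p(f, g), h)    # parameters shaped ((p, q), r)
right = compose_p(f, compose_p(g, h))   # parameters shaped (p, (q, r))
```

The two bracketings `left` and `right` compute the same function once parameters are related by the isomorphism ((p, q), r) ↦ (p, (q, r)), which is why composition is associative only up to the equivalence described above.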


Our next goal is to supply it with a monoidal structure. We borrow the latter from the sm-category (Cat, ×), whose tensor is given by the product. There is an identical on objects embedding (Cat, ×) → pCat that maps a functor $f\colon \mathbf{A} \to \mathbf{B}$ to a p-functor $f\colon \mathbf{A} \xrightarrow{\mathbf{1}} \mathbf{B}$ whose parameter space is the singleton category 1. Moreover, as this embedding is a functor, the coherence equations for the associators and unitors that hold in (Cat, ×) hold in pCat as well (this proof idea is borrowed from [17]). In this way, pCat becomes an sm-category. In a similar way, we define the sm-category pSet of small sets and parametrized functions between them, which is the codiscrete version of pCat. The diagram in Fig. 7 shows how these categories are related.

[Fig. 7: diagram relating pCat, pSet, (Cat, ×) and (Set, ×).]


A.2  Ala-lenses as categorification of ML-learners 


Figure 8 shows a discrete two-dimensional plane with each axis having three points: a space is a singleton, a set, a category encoded by coordinates 0,1,2 resp. Each of the points $x_{ij}$ is then the location of a corresponding sm-category of (asymmetric) learning (delta) lenses. Category {1} is a terminal category whose

[Figure: the vertical axis is the parameter space (P = 1, P ∈ Cat*, P ∈ Cat) and the horizontal axis is the model spaces (A,B = 1, A,B ∈ Cat*, A,B ∈ Cat); labelled points include codiscrete lenses, delta lenses with amendment, the learners of Fong et al., categorical learners, codiscretely learning delta lenses with amendment, and learning delta lenses with amendment (aLaLens).]

Fig. 8: The universe of categories of learning delta lenses
only arrow is the identity lens 1 = (id1,id1): 1 — 1 propagating from a terminal 
category 1 to itself. Label * refers to the codiscrete specialization of the construct 
being labelled: L* means codiscrete learning (i.e., the parameter space P is a 
set considered as a codiscrete category) and aLens* refers to codiscrete model 
spaces. The category of learners defined in [17] is located at point (1,1), and the
category of learning delta lenses with amendments defined in the present paper 
is located at (2,2). There are also two semi-categorificated species of learning 
lenses: categorical learners at point (1,2) and codiscretely learning delta lenses 
at (2,1), which are special cases of ala-lenses. 
Abstract. Intersection types are an essential tool in the analysis of operational and denotational properties of lambda-terms and functional pro-
grams. Among them, non-idempotent intersection types provide precise 
quantitative information about the evaluation of terms and programs. 
However, unlike simple or second-order types, intersection types cannot 
be considered as a logical system because the application rule (or the 
intersection rule, depending on the presentation of the system) involves 
a condition stipulating that the proofs of premises must have the same 
structure. Using earlier work introducing an indexed version of Linear 
Logic, we show that non-idempotent typing can be given a logical form 
in a system where formulas represent hereditarily indexed families of 
intersection types. 


Keywords: Lambda Calculus - Denotational Semantics - Intersection 
Types - Linear Logic 


Introduction 


Intersection types, introduced in the work of Coppo and Dezani [4,5] and developed since then by many authors, are still a very active research topic. As quite clearly explained in [13], the Coppo and Dezani intersection type system DΩ can be understood as a syntactic presentation of the denotational interpretation of λ-terms in Engeler's model, which is a model of the pure λ-calculus in the cartesian closed category of prime-algebraic complete lattices and Scott continuous functions.

Intersection types can be considered as formulas of the propositional calculus with implication ⇒ and conjunction ∧ as connectives. However, as pointed out by Hindley [12], intersection type deduction rules depart drastically from the standard logical rules of intuitionistic logic (and of any standard logical system) by the fact that, in the ∧-introduction rule, it is assumed that the proofs of the two premises are typings of the same λ-term, which means that, in some sense made precise by the typing system itself, they have the same structure. Such requirements on the proofs of premises, and not only on the formulas proven in premises, are absent from standard (intuitionistic or classical) logical systems where the proofs of premises are completely independent from each other. Many authors have addressed this issue; we refer to [14] for a discussion of several solutions which mainly focus on the design of à la Church presentations of intersection typing systems, thus enriching λ-terms with additional structures. Among the most recent and convincing contributions to this line of research we should certainly mention [15].

In our "new" approach to this problem (not so new actually, since it dates back to [3]), we change formulas instead of changing terms. It is based on a specific model of Linear Logic (and thus of the λ-calculus): the relational model. It is fair to credit Girard for the introduction of this model since it appears at least implicitly in [11]. It was probably known by many people in the Linear Logic community as a piece of folklore since the early 1990's and is presented formally in [3]. In this quite simple and canonical denotational model, types are interpreted as sets (without any additional structure) and a closed term of type σ is interpreted as a subset of the interpretation of σ. It is quite easy to define, in this semantic framework, analogues of the usual models of the pure λ-calculus such as Scott's $D_\infty$ or Engeler's model, which in some sense are simpler than the original ones since the sets interpreting types need not be pre-ordered. As explained in the work of De Carvalho [6,7], the intersection type counterpart of this semantics is a typing system where "intersection" is non-idempotent (in sharp contrast with the original systems introduced by Coppo and Dezani), sometimes called system R. Notice that the precise connection between the idempotent and non-idempotent approaches is analyzed in [8], in a quite general Linear Logic setting by means of an extensional collapse.

In order to explain our approach, we restrict first to simple types, interpreted as follows in the relational model: a basic type α is interpreted as a given set ⟦α⟧ and the type σ ⇒ τ is interpreted as the set $\mathcal{M}_{fin}(⟦σ⟧) \times ⟦τ⟧$ (where $\mathcal{M}_{fin}(E)$ is the set of finite multisets of elements of $E$). Remember indeed that intersection types can be considered as a syntactic presentation of denotational semantics, so it makes sense to define intersection types relative to simple types (in the spirit of [10]) as we do in Section 3: an intersection type relative to the base type α is an element of ⟦α⟧ and an intersection type relative to σ ⇒ τ is a pair $([a_1,\dots,a_n], b)$ where the $a_i$'s are intersection types relative to σ and $b$ is an intersection type relative to τ; with more usual notations¹ $([a_1,\dots,a_n], b)$ would be written $(a_1 \wedge \cdots \wedge a_n) \to b$. Then, given a type σ, the main idea consists in representing an indexed family of elements of ⟦σ⟧ as a formula of a new logical system. If σ = (φ ⇒ ψ) then the family can be written² $([a_k \mid k \in K \text{ and } u(k) = j], b_j)_{j \in J}$ where $J$ and $K$ are indexing sets, $u\colon K \to J$ is a function such that $u^{-1}(\{j\})$ is finite for all $j \in J$, $(b_j)_{j \in J}$ is a family of elements of ⟦ψ⟧ (represented by a formula $B$) and $(a_k)_{k \in K}$ is a family of elements of ⟦φ⟧ (represented by a formula $A$): in that case we introduce the implicative formula $(A \Rightarrow_u B)$ to represent the family $([a_k \mid k \in K \text{ and } u(k) = j], b_j)_{j \in J}$. It is clear that a family of simple types has generally infinitely many representations as such formulas; this huge redundancy makes it possible to establish a tight link between inhabitation of intersection types and provability of formulas representing them (in an indexed version LJ(I) of intuitionistic logic). Such a correspondence is exhibited in Section 3 in the simply typed setting and the idea is quite simple:


¹ That we prefer not to use, to avoid confusion between these two levels of typing.
² We use [···] for denoting multisets much as one uses {···} for denoting sets, the only difference is that multiplicities are taken into account.


given a type σ, a family $(a_j)_{j \in J}$ of elements of ⟦σ⟧, and a closed λ-term $M$ of type σ, it is equivalent to say that $\vdash M : a_j$ holds for all $j$ and to say that some (and actually any) formula $A$ representing $(a_j)_{j \in J}$ has an LJ(I) proof³ whose underlying λ-term is $M$.


In Section 4 we extend this approach to the untyped λ-calculus taking as underlying model of the pure λ-calculus our relational version $R_\infty$ of Scott's $D_\infty$. We define an adapted version of LJ(I) and establish a similar correspondence, with some slight modifications due to the specificities of $R_\infty$.


1 Notations and preliminary definitions 


If $E$ is a set, a finite multiset of elements of $E$ is a function $m\colon E \to \mathbb{N}$ such that the set $\{a \in E \mid m(a) \neq 0\}$ (called the domain of $m$) is finite. The cardinal of such a multiset $m$ is $\#m = \sum_{a \in E} m(a)$. We use + for the obvious addition operation on multisets, and if $a_1,\dots,a_n$ are elements of $E$, we use $[a_1,\dots,a_n]$ for the corresponding multiset (taking multiplicities into account); for instance [0,1,0,2,1] is the multiset $m$ of elements of $\mathbb{N}$ such that $m(0) = 2$, $m(1) = 2$, $m(2) = 1$ and $m(i) = 0$ for $i > 2$. If $(a_i)_{i \in I}$ is a family of elements of $E$ and if $J$ is a finite subset of $I$, we use $[a_i \mid i \in J]$ for the multiset of elements of $E$ which maps $a \in E$ to the number of elements $i \in J$ such that $a_i = a$ (which is finite since $J$ is). We use $\mathcal{M}_{fin}(E)$ for the set of finite multisets of elements of $E$.
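These notations can be mirrored with Python's standard `collections.Counter`, which stores multiplicities and returns 0 for absent elements. This is only an illustrative encoding; the variable names are hypothetical.

```python
from collections import Counter

# The multiset [0, 1, 0, 2, 1] as a function from elements to multiplicities.
m = Counter([0, 1, 0, 2, 1])

# Cardinal #m: the sum of all multiplicities.
cardinal = sum(m.values())

# Domain of m: the elements with non-zero multiplicity.
domain = {a for a, count in m.items() if count != 0}

# Addition of multisets adds multiplicities pointwise.
total = m + Counter([2, 2, 5])

# The multiset [a_i | i in J] for a family (a_i) and a finite index set J.
family = {"i1": "x", "i2": "y", "i3": "x", "i4": "z"}
J = {"i1", "i2", "i3"}
restricted = Counter(family[i] for i in J)
```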

We use $+$ to denote set union when we want to stress the fact that the 
involved sets are disjoint. A function $u : J \to K$ is almost injective if $\#u^{-1}(\{k\})$ 
is finite for each $k \in K$ (equivalently, the inverse image of any finite subset of 
$K$ under $u$ is finite). If $s = (a_1,\dots,a_n)$ is a sequence of elements of $E$ and 
$i \in \{1,\dots,n\}$, we use $(s)\setminus i$ for the sequence $(a_1,\dots,a_{i-1},a_{i+1},\dots,a_n)$. Given 
sets $E$ and $F$, we use $F^E$ for the set of functions from $E$ to $F$. The elements of 
$F^E$ are sometimes considered as functions $u$ (with a functional notation $u(e)$ for 
application) and sometimes as indexed families $a$ (with index notations $a_e$ for 
application) especially when $E$ is countable. 

If $i \in \{1,\dots,n\}$ and $j \in \{1,\dots,n-1\}$, we define $s(j,i) \in \{1,\dots,n\}$ as 
follows: $s(j,i) = j$ if $j < i$ and $s(j,i) = j+1$ if $j \geq i$. 
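As a quick sanity check of the index shift $s(j,i)$, the following Python sketch (with hypothetical helper names `s` and `remove`, and 1-based indices as in the text) shows that the $j$-th element of $(s)\setminus i$ is the element of index $s(j,i)$ in the original sequence.

```python
def s(j, i):
    """Index shift: j if j < i, and j + 1 if j >= i (indices start at 1)."""
    return j if j < i else j + 1

def remove(seq, i):
    """The sequence seq without its i-th element (1-based index)."""
    return seq[:i - 1] + seq[i:]

seq = ('a', 'b', 'c', 'd')
shorter = remove(seq, 2)  # drops 'b'
```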


3 Any such proof can be stripped of its indexing data, giving rise to a proof of $\sigma$ in 
intuitionistic logic. 

2 The relational model of the $\lambda$-calculus 


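Let $\mathbf{Rel}_!$ be the category whose objects are sets$^4$ and $\mathbf{Rel}_!(X,Y) = \mathcal{P}(\mathcal{M}_{\mathrm{fin}}(X) \times Y)$ 
with $\mathrm{Id}_X = \{([a],a) \mid a \in X\}$ and composition of $s \in \mathbf{Rel}_!(X,Y)$ and $t \in 
\mathbf{Rel}_!(Y,Z)$ given by 

$$t \circ s = \{(m_1 + \cdots + m_k, c) \mid 
\exists b_1,\dots,b_k \in Y\ ([b_1,\dots,b_k], c) \in t \text{ and } \forall j\ (m_j, b_j) \in s\}.$$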


It is easily checked that this composition law is associative and that $\mathrm{Id}$ is neutral 
for composition$^5$. This category has all countable products: let $(X_j)_{j\in J}$ be a 
countable family of sets, their product is $X = \&_{j\in J} X_j = \bigcup_{j\in J} \{j\} \times X_j$ with 
projections $(\mathrm{pr}_j)_{j\in J}$ given by $\mathrm{pr}_j = \{([(j,a)],a) \mid a \in X_j\} \in \mathbf{Rel}_!(X, X_j)$ and if 
$(s_j)_{j\in J}$ is a family of morphisms $s_j \in \mathbf{Rel}_!(Y, X_j)$ then their tupling is $\langle s_j\rangle_{j\in J} = 
\{(m, (j,b)) \mid j \in J \text{ and } (m,b) \in s_j\} \in \mathbf{Rel}_!(Y, X)$. 
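The composition law of $\mathbf{Rel}_!$ can be tried out on finite relations with the following Python sketch. It assumes that a finite multiset is encoded as a sorted tuple and a morphism as a Python set of (multiset, element) pairs; the names `mset`, `msum`, `identity` and `compose` are hypothetical.

```python
from itertools import product

def mset(*xs):
    """A finite multiset, encoded as a sorted tuple."""
    return tuple(sorted(xs))

def msum(ms):
    """Sum of a sequence of multisets."""
    return tuple(sorted(x for m in ms for x in m))

def identity(X):
    """Id_X = {([a], a) | a in X}."""
    return {(mset(a), a) for a in X}

def compose(t, s):
    """t o s: for each ([b1..bk], c) in t, pick pairs (m_j, b_j) in s and sum the m_j."""
    result = set()
    for (bs, c) in t:
        # One list of candidate multisets per occurrence of an element in bs.
        choices = [[m for (m, b) in s if b == b_j] for b_j in bs]
        for pick in product(*choices):
            result.add((msum(pick), c))
    return result

s_rel = {(mset(1, 1), 'x'), (mset(2), 'x'), (mset(), 'y')}
t_rel = {(mset('x', 'x'), 'c'), (mset(), 'd')}
composite = compose(t_rel, s_rel)
```

On this example the identity relations are neutral on both sides, as the text states in general.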

The category $\mathbf{Rel}_!$ is cartesian closed with object of morphisms from $X$ to $Y$ 
the set $(X \Rightarrow Y) = \mathcal{M}_{\mathrm{fin}}(X) \times Y$ and evaluation morphism $\mathrm{Ev} \in \mathbf{Rel}_!((X \Rightarrow Y) \,\&\, 
X, Y)$ given by $\mathrm{Ev} = \{([(1,([a_1,\dots,a_k], b)), (2,a_1),\dots,(2,a_k)], b) \mid a_1,\dots,a_k \in 
X \text{ and } b \in Y\}$. The transpose (or curryfication) of $s \in \mathbf{Rel}_!(Z \,\&\, X, Y)$ is 
$\mathrm{Cur}(s) \in \mathbf{Rel}_!(Z, X \Rightarrow Y)$ given by $\mathrm{Cur}(s) = \{([c_1,\dots,c_n], ([a_1,\dots,a_k], b)) \mid 
([(1,c_1),\dots,(1,c_n),(2,a_1),\dots,(2,a_k)], b) \in s\}$. 


Relational $D_\infty$. Let $R_\infty$ be the least set such that $(m_0, m_1, \dots) \in R_\infty$ as soon 
as $m_0, m_1, \dots$ are finite multisets of elements of $R_\infty$ which are almost all equal 
to $[\,]$. Notice in particular that $e = ([\,],[\,],\dots) \in R_\infty$ and satisfies $e = ([\,], e)$. 
By construction we have $R_\infty = \mathcal{M}_{\mathrm{fin}}(R_\infty) \times R_\infty$, that is $R_\infty = (R_\infty \Rightarrow R_\infty)$ 
and hence $R_\infty$ is a model of the pure $\lambda$-calculus in $\mathbf{Rel}_!$ which also satisfies the 
$\eta$-rule. See [1] for general facts on this kind of model. 


3 The simply typed case 


We assume to be given a set of type atoms $\alpha, \beta, \dots$ and of variables $x, y, \dots$; 
types and terms are given as usual by $\sigma, \tau, \dots := \alpha \mid \sigma \Rightarrow \tau$ and $M, N, \dots := x \mid 
(M)\,N \mid \lambda x^\sigma\, N$. 


With any type atom $\alpha$ we associate a set $[\alpha]$. This interpretation is extended to 
all types by $[\sigma \Rightarrow \tau] = [\sigma] \Rightarrow [\tau] = \mathcal{M}_{\mathrm{fin}}([\sigma]) \times [\tau]$. The relational semantics of 
this $\lambda$-calculus can be described as a non-idempotent intersection type system, 
with judgments of shape $x_1 : m_1 : \sigma_1, \dots, x_n : m_n : \sigma_n \vdash M : a : \sigma$ where the $x_i$'s 
are pairwise distinct variables, $M$ is a term, $a \in [\sigma]$ and $m_i \in \mathcal{M}_{\mathrm{fin}}([\sigma_i])$ for 
each $i$. Here are the typing rules: 


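$$\frac{j \neq i \Rightarrow m_j = [\,] \quad\text{and}\quad m_i = [a]}
       {(x_j : m_j : \sigma_j)_{j=1}^n \vdash x_i : a : \sigma_i}
\qquad
\frac{\Phi, x : m : \sigma \vdash M : b : \tau}
     {\Phi \vdash \lambda x^\sigma\, M : (m,b) : \sigma \Rightarrow \tau}$$

$$\frac{\Phi \vdash M : ([a_1,\dots,a_k], b) : \sigma \Rightarrow \tau \qquad (\Phi_l \vdash N : a_l : \sigma)_{l=1}^k}
       {\Psi \vdash (M)\,N : b : \tau}$$

where $\Phi = (x_i : m_i : \sigma_i)_{i=1}^n$, $\Phi_l = (x_i : m_i^l : \sigma_i)_{i=1}^n$ for $l = 1,\dots,k$ and 
$\Psi = (x_i : m_i + \sum_{l=1}^k m_i^l : \sigma_i)_{i=1}^n$. 


4 We can restrict to countable sets. 

5 This results from the fact that $\mathbf{Rel}_!$ arises as the Kleisli category of the LL model of 
sets and relations, see [3] for instance. 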


3.1 Why do we need another system? 


The trouble with this deduction system is that it cannot be considered as the 
term decorated version of an underlying “logical system for intersection types” 
allowing to prove sequents of shape $m_1 : \sigma_1, \dots, m_n : \sigma_n \vdash a : \sigma$ (where 
non-idempotent intersection types $m_i$ and $a$ are considered as logical formulas, the 
ordinary types $\sigma_i$ playing the role of “kinds”) because, in the application rule 
above, it is required that all the proofs of the $k$ right hand side premises have the 
same shape given by the $\lambda$-term $N$. We propose now a “logical system” derived 
from [3] which, in some sense, solves this issue. The main idea is quite simple and 
relies on three principles: (1) replace hereditarily multisets with indexed families 
in intersection types, (2) instead of proving single types, prove indexed families 
of hereditarily indexed types and (3) represent syntactically such families (of 
hereditarily indexed types) as formulas of a new system of indexed logic. 


3.2 Minimal LJ(I) 


We define now the syntax of indexed formulas. Assume to be given an infinite 
countable set $I$ of indices. Then we define indexed types $A$; with each such type 
we associate an underlying type $\underline{A}$, a set $d(A)$ and a family $\langle A\rangle \in [\underline{A}]^{d(A)}$. These 
formulas are given by the following inductive definition: 


— if $J \subseteq I$ and $f : J \to [\alpha]$ is a function then $\alpha[f]$ is a formula with $\underline{\alpha[f]} = \alpha$, 
$d(\alpha[f]) = J$ and $\langle\alpha[f]\rangle = f$ 

— and if $A$ and $B$ are formulas and $u : d(A) \to d(B)$ is almost injective then 
$A \Rightarrow_u B$ is a formula with $\underline{A \Rightarrow_u B} = \underline{A} \Rightarrow \underline{B}$, $d(A \Rightarrow_u B) = d(B)$ and, for 
$k \in d(B)$, $\langle A \Rightarrow_u B\rangle_k = ([\langle A\rangle_j \mid j \in d(A) \text{ and } u(j) = k], \langle B\rangle_k)$. 
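The inductive definition of indexed formulas can be made concrete with a short Python sketch. It is only an illustration under stated assumptions: index sets are finite, functions are dicts, multisets are tuples sorted by `repr`, and the class and function names (`Atom`, `Arrow`, `d`, `underlying`, `family`) are hypothetical.

```python
from dataclasses import dataclass

@dataclass
class Atom:
    name: str   # the type atom alpha
    f: dict     # the function f : J -> [alpha]

@dataclass
class Arrow:
    left: object   # the formula A
    u: dict        # the function u : d(A) -> d(B)
    right: object  # the formula B

def d(formula):
    """The index set d(A)."""
    return set(formula.f) if isinstance(formula, Atom) else d(formula.right)

def underlying(formula):
    """The underlying simple type of a formula."""
    if isinstance(formula, Atom):
        return formula.name
    return (underlying(formula.left), '=>', underlying(formula.right))

def family(formula):
    """The family <A>, as a dict from d(A) to intersection types."""
    if isinstance(formula, Atom):
        return dict(formula.f)
    fa, fb = family(formula.left), family(formula.right)
    return {k: (tuple(sorted((fa[j] for j in fa if formula.u[j] == k), key=repr)), fb[k])
            for k in fb}

A = Atom('a', {1: 'p', 2: 'q', 3: 'p'})
B = Atom('b', {10: 'x', 11: 'y'})
F = Arrow(A, {1: 10, 2: 10, 3: 10}, B)
```

In this example all three indices of `A` are sent to index 10 of `B`, so the component of `F` at 10 carries the multiset of the three elements and the component at 11 carries the empty multiset.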


Proposition 1. Let $\sigma$ be a type, $J$ be a subset of $I$ and $f \in [\sigma]^J$. There is a 
formula $A$ such that $\underline{A} = \sigma$, $d(A) = J$ and $\langle A\rangle = f$ (actually, there are infinitely 
many such $A$'s as soon as $\sigma$ is not an atom and $J \neq \emptyset$). 


Proof. The proof is by induction on $\sigma$. If $\sigma$ is an atom $\alpha$ then we take $A = \alpha[f]$. 
Assume that $\sigma = (\rho \Rightarrow \tau)$ so that $f(j) = (m_j, b_j)$ with $m_j \in \mathcal{M}_{\mathrm{fin}}([\rho])$ and 
$b_j \in [\tau]$. Since each $m_j$ is finite and $I$ is infinite, we can find a family $(K_j)_{j\in J}$ of 
pairwise disjoint finite subsets of $I$ such that $\#K_j = \#m_j$. Let $K = \bigcup_{j\in J} K_j$; 
there is a function $g : K \to [\rho]$ such that $m_j = [g(k) \mid k \in K_j]$ for each $j \in J$ 
(choose first an enumeration $g_j : K_j \to [\rho]$ of $m_j$ for each $j$ and then define 
$g(k) = g_j(k)$ where $j$ is the unique element of $J$ such that $k \in K_j$). Let $u : K \to J$ 
be the unique function such that $k \in K_{u(k)}$ for all $k \in K$; since each $K_j$ is finite, 
this function $u$ is almost injective. By inductive hypothesis there is a formula $A$ 
such that $\underline{A} = \rho$, $d(A) = K$ and $\langle A\rangle = g$, and there is a formula $B$ such that 
$\underline{B} = \tau$, $d(B) = J$ and $\langle B\rangle = (b_j)_{j\in J}$. Then the formula $A \Rightarrow_u B$ is well formed 
(since $u$ is an almost injective function $d(A) = K \to d(B) = J$) and satisfies 
$\underline{A \Rightarrow_u B} = \sigma$, $d(A \Rightarrow_u B) = J$ and $\langle A \Rightarrow_u B\rangle = f$ as contended. 
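The proof of Proposition 1 is constructive, and for finite families it can be followed step by step in Python. The sketch below is an illustration under assumptions that are not in the paper: formulas are nested tuples, a type is either an atom name or a pair `(rho, tau)` standing for $\rho \Rightarrow \tau$, multisets are tuples sorted by `repr`, and fresh indices are drawn from a counter assumed to be disjoint from the indices already in use.

```python
import itertools

def mset(*xs):
    """A finite multiset, encoded as a tuple sorted by repr."""
    return tuple(sorted(xs, key=repr))

def represent(sigma, f, fresh):
    """Build a formula with underlying type sigma whose family is f (a dict on J)."""
    if isinstance(sigma, str):                  # sigma is an atom
        return ('atom', sigma, dict(f))
    rho, tau = sigma                            # sigma = rho => tau
    g, u, b = {}, {}, {}
    for j, (m_j, b_j) in f.items():
        b[j] = b_j
        for a in m_j:                           # one fresh index per occurrence in m_j
            k = next(fresh)
            g[k] = a                            # g enumerates the elements of the m_j
            u[k] = j                            # u sends k to the j with k in K_j
    return ('arrow', represent(rho, g, fresh), u, represent(tau, b, fresh))

def family(A):
    """The family <A> of a formula built by represent."""
    if A[0] == 'atom':
        return dict(A[2])
    _, left, u, right = A
    fa, fb = family(left), family(right)
    return {k: (mset(*(fa[j] for j in fa if u[j] == k)), fb[k]) for k in fb}

# A family of two elements of [(a => a) => a], indexed by J = {0, 1}.
sigma = (('a', 'a'), 'a')
f = {0: (mset((mset(1, 2), 3), (mset(), 3)), 7), 1: (mset(), 8)}
A = represent(sigma, f, itertools.count(1000))
```

Choosing a different supply of fresh indices yields a different formula with the same family, which illustrates the redundancy mentioned in the statement.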


As a consequence, for any type $\sigma$ and any element $a$ of $[\sigma]$ (so $a$ is a 
non-idempotent intersection type of kind $\sigma$), one can find a formula $A$ such that 
$\underline{A} = \sigma$, $d(A) = \{j\}$ (where $j$ is an arbitrary element of $I$) and $\langle A\rangle_j = a$. In other 
words, any intersection type can be represented as a formula (in infinitely many 
different ways in general of course, but up to renaming of indices, that is, up to 
“hereditary $\alpha$-equivalence”, this representation is unique). 

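For any formula $A$ and $J \subseteq I$, we define a formula $A\restriction_J$ such that $\underline{A\restriction_J} = \underline{A}$, 
$d(A\restriction_J) = d(A) \cap J$ and $\langle A\restriction_J\rangle = \langle A\rangle\restriction_J$. The definition is by induction on $A$. 


— $\alpha[f]\restriction_J = \alpha[f\restriction_J]$ 
— $(A \Rightarrow_u B)\restriction_J = (A\restriction_K \Rightarrow_v B\restriction_J)$ where $K = u^{-1}(d(B) \cap J)$ and $v = u\restriction_K$. 


Let $u : d(A) \to J$ be a bijection (so that $u(d(A)) = J$), we define a formula 
$u_*(A)$ such that $\underline{u_*(A)} = \underline{A}$, $d(u_*(A)) = u(d(A))$ and $\langle u_*(A)\rangle_j = \langle A\rangle_{u^{-1}(j)}$. The 
definition is by induction on $A$: 


— $u_*(\alpha[f]) = \alpha[f \circ u^{-1}]$ 
— $u_*(A \Rightarrow_v B) = (A \Rightarrow_{u \circ v} u_*(B))$. 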
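Restriction and relocation can be sketched in Python for formulas with finite index sets. The encoding is an assumption made for illustration only: formulas are nested tuples `('atom', name, f)` and `('arrow', A, u, B)` with dicts for `f` and `u`, and the function names are hypothetical.

```python
def d(A):
    """The index set d(A)."""
    return set(A[2]) if A[0] == 'atom' else d(A[3])

def family(A):
    """The family <A>, with multisets encoded as tuples sorted by repr."""
    if A[0] == 'atom':
        return dict(A[2])
    _, left, u, right = A
    fa, fb = family(left), family(right)
    return {k: (tuple(sorted((fa[j] for j in fa if u[j] == k), key=repr)), fb[k])
            for k in fb}

def restrict(A, J):
    """The restriction of A to the indices in J."""
    if A[0] == 'atom':
        return ('atom', A[1], {j: a for j, a in A[2].items() if j in J})
    _, left, u, right = A
    K = {k for k in d(left) if u[k] in J}   # K = u^{-1}(d(B) ∩ J)
    v = {k: u[k] for k in K}                # v = u restricted to K
    return ('arrow', restrict(left, K), v, restrict(right, J))

def relocate(u, A):
    """u_*(A) for a bijection u, given as a dict defined on d(A)."""
    if A[0] == 'atom':
        return ('atom', A[1], {u[j]: a for j, a in A[2].items()})  # f ∘ u^{-1}
    _, left, v, right = A
    return ('arrow', left, {k: u[v[k]] for k in v}, relocate(u, right))

A = ('arrow', ('atom', 'a', {1: 'p', 2: 'q', 3: 'p'}),
     {1: 10, 2: 10, 3: 11},
     ('atom', 'b', {10: 'x', 11: 'y'}))
restricted = restrict(A, {11, 99})
moved = relocate({10: 's', 11: 't'}, A)
```

The two properties stated in the text, namely that the family of a restriction is the restricted family and that the family of a relocation is the reindexed family, can be checked on this example.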


Using these two auxiliary notions, we can give a set of three deduction rules 
for a minimal natural deduction allowing to prove formulas in this indexed 
intuitionistic logic. This logical system allows to derive sequents which are of shape 

$$A_1^{u_1}, \dots, A_n^{u_n} \vdash B \qquad (1)$$

where for each $i = 1,\dots,n$, the function $u_i : d(A_i) \to d(B)$ is almost injective (it 
is not required that $d(B) = \bigcup_{i=1}^n u_i(d(A_i))$). Notice that the expressions $A_i^{u_i}$ are 
not formulas; this construction $A^u$ is part of the syntax of sequents, just as the “,” 
separating these pseudo-formulas. Given a formula $A$ and $u : d(A) \to J$ almost 
injective, it is nevertheless convenient to define $\langle A^u\rangle \in \mathcal{M}_{\mathrm{fin}}([\underline{A}])^J$ by $\langle A^u\rangle_j = 
[\langle A\rangle_k \mid u(k) = j]$. In particular, when $u$ is a bijection, $\langle A^u\rangle_j = [\langle A\rangle_{u^{-1}(j)}]$. 

The crucial point here is that such a sequent (1) involves no $\lambda$-term. 

The main difference between the original system LL(I) of [3] and the present 
system is the way axioms are dealt with. In LL(I) there is no explicit identity 
axiom and only “atomic axioms” restricted to the basic constants of LL; indeed 
it is well-known that in LL all identity axioms can be $\eta$-expanded, leading to 
proofs using only such atomic axioms. In the $\lambda$-calculus, and especially in the 
untyped $\lambda$-calculus we want to deal with in the next sections, such $\eta$-expansions are 
hard to handle so we prefer to use explicit identity axioms. 

The axiom is 

$$\frac{j \neq i \Rightarrow d(A_j) = \emptyset \quad\text{and}\quad u_i \text{ is a bijection}}
       {A_1^{u_1}, \dots, A_n^{u_n} \vdash u_{i*}(A_i)}$$

so that for $j \neq i$, the function $u_j$ is empty. A special case is 

$$\frac{j \neq i \Rightarrow d(A_j) = \emptyset \quad\text{and}\quad u_i \text{ is the identity function}}
       {A_1^{u_1}, \dots, A_n^{u_n} \vdash A_i}$$

which may look more familiar, but the general axiom rule, allowing to “delocalize” 
the proven formula $A_i$ by an arbitrary bijection $u_i$, is required as we shall see. 
The $\Rightarrow$ introduction rule is quite simple 

$$\frac{A_1^{u_1}, \dots, A_n^{u_n}, A^u \vdash B}
       {A_1^{u_1}, \dots, A_n^{u_n} \vdash A \Rightarrow_u B}$$

Last, the $\Rightarrow$ elimination rule is more complicated (from a Linear Logic point 
of view, this is due to the fact that it combines 3 LL logical rules: $\multimap$ elimination, 
contraction and promotion). We have the deduction 

$$\frac{C_1^{u_1}, \dots, C_n^{u_n} \vdash A \Rightarrow_u B \qquad D_1^{v_1}, \dots, D_n^{v_n} \vdash A}
       {E_1^{w_1}, \dots, E_n^{w_n} \vdash B}$$

under the following conditions, to be satisfied by the involved formulas and 
functions: for each $i = 1,\dots,n$ one has $d(C_i) \cap d(D_i) = \emptyset$, $d(E_i) = d(C_i) + d(D_i)$, 
$C_i = E_i\restriction_{d(C_i)}$, $D_i = E_i\restriction_{d(D_i)}$, $w_i\restriction_{d(C_i)} = u_i$, and $w_i\restriction_{d(D_i)} = u \circ v_i$. 

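Let $\pi$ be a deduction tree of the sequent $A_1^{u_1}, \dots, A_n^{u_n} \vdash B$ in this system. 
By dropping all index information we obtain a derivation tree $\underline{\pi}$ of $\underline{A_1}, \dots, \underline{A_n} \vdash 
\underline{B}$, and, upon choosing a sequence $\vec{x}$ of $n$ pairwise distinct variables, we can 
associate with this derivation tree a simply typed $\lambda$-term $\underline{\pi}_{\vec{x}}$ which satisfies 
$x_1 : \underline{A_1}, \dots, x_n : \underline{A_n} \vdash \underline{\pi}_{\vec{x}} : \underline{B}$. 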


3.3 Basic properties of LJ(I) 


We prove some basic properties of this logical system. This is also the opportunity 
to get some acquaintance with it. Notice that in many places we drop the type 
annotations of variables in $\lambda$-terms, first because they are easy to recover, and 
second because the very same results and proofs are also valid in the untyped 
setting of Section 4. 


Lemma 1 (Weakening). Assume that $\Phi \vdash A$ is provable by a proof $\pi$ and let 
$B$ be a formula such that $d(B) = \emptyset$. Then $\Phi' \vdash A$ is provable by a proof $\pi'$, where 
$\Phi'$ is obtained by inserting $B^{0_{d(A)}}$ at any place in $\Phi$. Moreover $\underline{\pi'}_{\vec{x}'} = \underline{\pi}_{\vec{x}}$ (where 
$\vec{x}'$ is obtained from $\vec{x}$ by inserting a dummy variable at the same place). 

The proof is an easy induction on the proof of $\Phi \vdash A$. 


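Lemma 2 (Relocation). Let $\pi$ be a proof of $(A_i^{u_i})_{i=1}^n \vdash A$ and let $u : d(A) \to J$ be 
a bijection. There is a proof $\pi'$ of $(A_i^{u \circ u_i})_{i=1}^n \vdash u_*(A)$ such that $\underline{\pi'}_{\vec{x}} = \underline{\pi}_{\vec{x}}$. 

The proof is a straightforward induction on $\pi$. 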

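Lemma 3 (Restriction). Let $\pi$ be a proof of $(A_i^{u_i})_{i=1}^n \vdash A$ and let $J \subseteq d(A)$. 
For $i = 1,\dots,n$, let $K_i = u_i^{-1}(J) \subseteq d(A_i)$ and $u_i' = u_i\restriction_{K_i} : K_i \to J$. Then the 
sequent $((A_i\restriction_{K_i})^{u_i'})_{i=1}^n \vdash A\restriction_J$ has a proof $\pi'$ such that $\underline{\pi'}_{\vec{x}} = \underline{\pi}_{\vec{x}}$. 

Proof. By induction on $\pi$. Assume that $\pi$ consists of an axiom $(A_j^{u_j})_{j=1}^n \vdash u_{i*}(A_i)$ with 
$d(A_j) = \emptyset$ if $j \neq i$, and $u_i$ a bijection. With the notations of the lemma, 
$K_j = \emptyset$ for $j \neq i$ and $u_i'$ is a bijection $K_i \to J$. Moreover $u'_{i*}(A_i\restriction_{K_i}) = u_{i*}(A_i)\restriction_J$ 
so that $((A_j\restriction_{K_j})^{u_j'})_{j=1}^n \vdash A\restriction_J$ is obtained by an axiom $\pi'$ with $\underline{\pi'}_{\vec{x}} = x_i = \underline{\pi}_{\vec{x}}$. 

Assume that $\pi$ ends with a $\Rightarrow$-introduction rule: 

$$\frac{\begin{array}{c}\rho\\ \vdots\\ (A_i^{u_i})_{i=1}^{n+1} \vdash B\end{array}}
       {(A_i^{u_i})_{i=1}^{n} \vdash A_{n+1} \Rightarrow_{u_{n+1}} B}$$

with $A = (A_{n+1} \Rightarrow_{u_{n+1}} B)$, and we have $\underline{\pi}_{\vec{x}} = \lambda x_{n+1}\, \underline{\rho}_{\vec{x},x_{n+1}}$. With the 
notations of the lemma we have $A\restriction_J = (A_{n+1}\restriction_{K_{n+1}} \Rightarrow_{u'_{n+1}} B\restriction_J)$. By inductive 
hypothesis there is a proof $\rho'$ of $((A_i\restriction_{K_i})^{u_i'})_{i=1}^{n+1} \vdash B\restriction_J$ such that $\underline{\rho'}_{\vec{x},x_{n+1}} = \underline{\rho}_{\vec{x},x_{n+1}}$ 
and hence we have a proof $\pi'$ of $((A_i\restriction_{K_i})^{u_i'})_{i=1}^{n} \vdash A\restriction_J$ with $\underline{\pi'}_{\vec{x}} = \lambda x_{n+1}\, \underline{\rho'}_{\vec{x},x_{n+1}} = 
\underline{\pi}_{\vec{x}}$ as contended. 

Assume last that $\pi$ ends with a $\Rightarrow$-elimination rule: 

$$\frac{\begin{array}{c}\mu\\ \vdots\\ (B_i^{v_i})_{i=1}^n \vdash B \Rightarrow_v A\end{array} \qquad
        \begin{array}{c}\rho\\ \vdots\\ (C_i^{w_i})_{i=1}^n \vdash B\end{array}}
       {(A_i^{u_i})_{i=1}^n \vdash A}$$

with $d(A_i) = d(B_i) + d(C_i)$, $B_i = A_i\restriction_{d(B_i)}$ and $C_i = A_i\restriction_{d(C_i)}$, $u_i\restriction_{d(B_i)} = v_i$ 
and $u_i\restriction_{d(C_i)} = v \circ w_i$ for $i = 1,\dots,n$, and of course $\underline{\pi}_{\vec{x}} = (\underline{\mu}_{\vec{x}})\, \underline{\rho}_{\vec{x}}$. Let 
$L = v^{-1}(J) \subseteq d(B)$. Let $L_i = v_i^{-1}(J)$ and $R_i = w_i^{-1}(L)$ for $i = 1,\dots,n$ (we 
also set $v_i' = v_i\restriction_{L_i}$, $w_i' = w_i\restriction_{R_i}$ and $v' = v\restriction_L$). By inductive hypothesis, we have 
a proof $\mu'$ of $((B_i\restriction_{L_i})^{v_i'})_{i=1}^n \vdash B\restriction_L \Rightarrow_{v'} A\restriction_J$ such that $\underline{\mu'}_{\vec{x}} = \underline{\mu}_{\vec{x}}$ and a proof $\rho'$ 
of $((C_i\restriction_{R_i})^{w_i'})_{i=1}^n \vdash B\restriction_L$ such that $\underline{\rho'}_{\vec{x}} = \underline{\rho}_{\vec{x}}$. Now, setting $K_i = u_i^{-1}(J)$, observe 
that 

— $d(B_i) \cap K_i = L_i = d(B_i\restriction_{L_i})$ and $u_i\restriction_{L_i} = v_i'$ since $u_i\restriction_{d(B_i)} = v_i$ 

— $d(C_i) \cap K_i = R_i = w_i^{-1}(L)$ since $u_i\restriction_{d(C_i)} = v \circ w_i$ and $L = v^{-1}(J)$, 
hence $d(C_i) \cap K_i = d(C_i\restriction_{R_i})$, and also $u_i\restriction_{R_i} = v' \circ w_i'$. 

It follows that $K_i = L_i + R_i$, and, setting $u_i' = u_i\restriction_{K_i}$, we have $u_i'\restriction_{L_i} = v_i'$ 
and $u_i'\restriction_{R_i} = v' \circ w_i'$. Hence we have a proof $\pi'$ of $((A_i\restriction_{K_i})^{u_i'})_{i=1}^n \vdash A\restriction_J$ such that 
$\underline{\pi'}_{\vec{x}} = (\underline{\mu'}_{\vec{x}})\, \underline{\rho'}_{\vec{x}} = (\underline{\mu}_{\vec{x}})\, \underline{\rho}_{\vec{x}} = \underline{\pi}_{\vec{x}}$ as contended. 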
Though substitution lemmas are usually trivial, the LJ(I) substitution lemma 
requires some care in its statement and proof$^6$. 

Lemma 4 (Substitution). Assume that $(A_j^{u_j})_{j=1}^n \vdash A$ with a proof $\mu$ and 
that, for some $i \in \{1,\dots,n\}$, $(B_j^{v_j})_{j=1}^{n-1} \vdash A_i$ with a proof $\rho$. Then there is a 
proof $\pi$ of $(C_j^{w_j})_{j=1}^{n-1} \vdash A$ such that $\underline{\pi}_{(\vec{x})\setminus i} = \underline{\mu}_{\vec{x}}[\underline{\rho}_{(\vec{x})\setminus i}/x_i]$ as soon as, for each 
$j = 1,\dots,n-1$, $d(C_j) = d(A_{s(j,i)}) + d(B_j)$ (remember 
that this requires also that $d(A_{s(j,i)}) \cap d(B_j) = \emptyset$) with: 

— $C_j\restriction_{d(A_{s(j,i)})} = A_{s(j,i)}$ and $w_j\restriction_{d(A_{s(j,i)})} = u_{s(j,i)}$ 

— $C_j\restriction_{d(B_j)} = B_j$ and $w_j\restriction_{d(B_j)} = u_i \circ v_j$. 


6 We use notations introduced in Section 1, especially for $s(j,i)$. 


206 T. Ehrhard 

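Proof. By induction on the proof $\mu$. Assume that $\mu$ is an axiom, so that there is 
a $k \in \{1,\dots,n\}$ such that $A = u_{k*}(A_k)$, $u_k$ is a bijection and $d(A_j) = \emptyset$ for all 
$j \neq k$. In that case we have $\underline{\mu}_{\vec{x}} = x_k$. There are two subcases to consider. Assume 
first that $k = i$. By Lemma 2 there is a proof $\rho'$ of $(B_j^{u_i \circ v_j})_{j=1}^{n-1} \vdash u_{i*}(A_i)$ such 
that $\underline{\rho'}_{(\vec{x})\setminus i} = \underline{\rho}_{(\vec{x})\setminus i}$. We have $C_j = B_j$ and $w_j = u_i \circ v_j$ for $j = 1,\dots,n-1$, 
so that $\rho'$ is a proof of $(C_j^{w_j})_{j=1}^{n-1} \vdash A$, so we take $\pi = \rho'$ and equation $\underline{\pi}_{(\vec{x})\setminus i} = 
\underline{\mu}_{\vec{x}}[\underline{\rho}_{(\vec{x})\setminus i}/x_i]$ holds since $\underline{\mu}_{\vec{x}} = x_i$. Assume next that $k \neq i$, then $d(A_i) = \emptyset$ 
and hence $d(B_j) = \emptyset$ (and $v_j = 0_\emptyset$) for $j = 1,\dots,n-1$. Therefore $C_j = A_{s(j,i)}$ 
and $w_j = u_{s(j,i)}$ for $j = 1,\dots,n-1$. So our target sequent $(C_j^{w_j})_{j=1}^{n-1} \vdash A$ can 
also be written $(A_{s(j,i)}^{u_{s(j,i)}})_{j=1}^{n-1} \vdash u_{k*}(A_k)$ and is provable by a proof $\pi$ such that 
$\underline{\pi}_{(\vec{x})\setminus i} = x_k$ as contended. 

Assume now that $\mu$ is a $\Rightarrow$-intro, that is $A = (A_{n+1} \Rightarrow_{u_{n+1}} A')$ and $\mu$ is 

$$\frac{\begin{array}{c}\theta\\ \vdots\\ (A_j^{u_j})_{j=1}^{n+1} \vdash A'\end{array}}
       {(A_j^{u_j})_{j=1}^{n} \vdash A_{n+1} \Rightarrow_{u_{n+1}} A'}$$

We set $B_n = A_{n+1}\restriction_\emptyset$ and of course $v_n = 0_{d(A_i)}$. Then we have a proof $\rho'$ of 
$(B_j^{v_j})_{j=1}^{n} \vdash A_i$ such that $\underline{\rho'}_{(\vec{x})\setminus i,\,x_{n+1}} = \underline{\rho}_{(\vec{x})\setminus i}$ by Lemma 1. We set $C_n = A_{n+1}$ 
and $w_n = u_{n+1}$. Then by inductive hypothesis applied to $\theta$ we have a proof 
$\pi^0$ of $(C_j^{w_j})_{j=1}^{n} \vdash A'$ which satisfies $\underline{\pi^0}_{(\vec{x})\setminus i,\,x_{n+1}} = \underline{\theta}_{\vec{x},x_{n+1}}[\underline{\rho'}_{(\vec{x})\setminus i,\,x_{n+1}}/x_i]$ and 
applying a $\Rightarrow$-introduction rule we get a proof $\pi$ of $(C_j^{w_j})_{j=1}^{n-1} \vdash A$ such that 
$\underline{\pi}_{(\vec{x})\setminus i} = \lambda x_{n+1}\,(\underline{\theta}_{\vec{x},x_{n+1}}[\underline{\rho}_{(\vec{x})\setminus i}/x_i]) = \underline{\mu}_{\vec{x}}[\underline{\rho}_{(\vec{x})\setminus i}/x_i]$ as expected. 

Assume last that the proof $\mu$ ends with 

$$\frac{\begin{array}{c}\varphi\\ \vdots\\ (E_j^{s_j})_{j=1}^n \vdash E \Rightarrow_s A\end{array} \qquad
        \begin{array}{c}\psi\\ \vdots\\ (F_j^{t_j})_{j=1}^n \vdash E\end{array}}
       {(A_j^{u_j})_{j=1}^n \vdash A}$$

with $d(A_j) = d(E_j) + d(F_j)$, $A_j\restriction_{d(E_j)} = E_j$, $A_j\restriction_{d(F_j)} = F_j$, $u_j\restriction_{d(E_j)} = s_j$ 
and $u_j\restriction_{d(F_j)} = s \circ t_j$, for $j = 1,\dots,n$. And we have $\underline{\mu}_{\vec{x}} = (\underline{\varphi}_{\vec{x}})\, \underline{\psi}_{\vec{x}}$. The 
idea is to “share” the substituting proof $\rho$ of $(B_j^{v_j})_{j=1}^{n-1} \vdash A_i$ among $\varphi$ and $\psi$ 
according to what they need, as specified by the formulas $E_i$ and $F_i$. So we write 
$d(B_j) = L_j + R_j$ where $L_j = v_j^{-1}(d(E_i))$ and $R_j = v_j^{-1}(d(F_i))$ and by Lemma 3 
we have two proofs $\rho^L$ of $((B_j\restriction_{L_j})^{v_j^L})_{j=1}^{n-1} \vdash E_i$ and $\rho^R$ of $((B_j\restriction_{R_j})^{v_j^R})_{j=1}^{n-1} \vdash F_i$ where we set 
$v_j^L = v_j\restriction_{L_j}$ and $v_j^R = v_j\restriction_{R_j}$, obtained from $\rho$ by restriction. These proofs satisfy 
$\underline{\rho^L}_{(\vec{x})\setminus i} = \underline{\rho^R}_{(\vec{x})\setminus i} = \underline{\rho}_{(\vec{x})\setminus i}$. 

Now we want to apply the inductive hypothesis to $\varphi$ and $\rho^L$, in order to get 
a proof of the sequent $(G_j^{w_j^L})_{j=1}^{n-1} \vdash E \Rightarrow_s A$ where $G_j = C_j\restriction_{d(E_{s(j,i)}) + L_j}$ (observe 
indeed that $d(E_{s(j,i)}) \subseteq d(A_{s(j,i)})$ and $L_j \subseteq d(B_j)$ and hence are disjoint by our 
assumption that $d(C_j) = d(A_{s(j,i)}) + d(B_j)$) and $w_j^L = w_j\restriction_{d(E_{s(j,i)}) + L_j}$. With 
these definitions, and by our assumptions about $C_j$ and $w_j$, we have for all 
$j = 1,\dots,n-1$ 

$$\begin{aligned}
G_j\restriction_{d(E_{s(j,i)})} &= C_j\restriction_{d(A_{s(j,i)})}\restriction_{d(E_{s(j,i)})} = A_{s(j,i)}\restriction_{d(E_{s(j,i)})} = E_{s(j,i)}\\
w_j^L\restriction_{d(E_{s(j,i)})} &= w_j\restriction_{d(A_{s(j,i)})}\restriction_{d(E_{s(j,i)})} = u_{s(j,i)}\restriction_{d(E_{s(j,i)})} = s_{s(j,i)}\\
G_j\restriction_{L_j} &= C_j\restriction_{d(B_j)}\restriction_{L_j} = B_j\restriction_{L_j}\\
w_j^L\restriction_{L_j} &= w_j\restriction_{d(B_j)}\restriction_{L_j} = (u_i \circ v_j)\restriction_{L_j} = u_i\restriction_{d(E_i)} \circ v_j^L = s_i \circ v_j^L
\end{aligned}$$

Therefore the inductive hypothesis applies yielding a proof $\varphi'$ of $(G_j^{w_j^L})_{j=1}^{n-1} \vdash 
E \Rightarrow_s A$ such that $\underline{\varphi'}_{(\vec{x})\setminus i} = \underline{\varphi}_{\vec{x}}[\underline{\rho^L}_{(\vec{x})\setminus i}/x_i] = \underline{\varphi}_{\vec{x}}[\underline{\rho}_{(\vec{x})\setminus i}/x_i]$. 

Next we want to apply the inductive hypothesis to $\psi$ and $\rho^R$, in order to 
get a proof of the sequent $(H_j^{r_j})_{j=1}^{n-1} \vdash E$ where, for $j = 1,\dots,n-1$, $H_j = 
C_j\restriction_{d(F_{s(j,i)}) + R_j}$ (again $d(F_{s(j,i)}) \subseteq d(A_{s(j,i)})$ and $R_j \subseteq d(B_j)$ are disjoint by our 
assumption that $d(C_j) = d(A_{s(j,i)}) + d(B_j)$) and $r_j$ is defined by $r_j\restriction_{d(F_{s(j,i)})} = 
t_{s(j,i)}$ and $r_j\restriction_{R_j} = t_i \circ v_j^R$. Remember indeed that $v_j^R : R_j \to d(F_i)$ and $t_i : 
d(F_i) \to d(E)$. We have 

$$\begin{aligned}
H_j\restriction_{d(F_{s(j,i)})} &= C_j\restriction_{d(A_{s(j,i)})}\restriction_{d(F_{s(j,i)})} = A_{s(j,i)}\restriction_{d(F_{s(j,i)})} = F_{s(j,i)}\\
H_j\restriction_{R_j} &= C_j\restriction_{d(B_j)}\restriction_{R_j} = B_j\restriction_{R_j}
\end{aligned}$$

and hence by inductive hypothesis there is a proof $\psi'$ of $(H_j^{r_j})_{j=1}^{n-1} \vdash E$ such that 
$\underline{\psi'}_{(\vec{x})\setminus i} = \underline{\psi}_{\vec{x}}[\underline{\rho^R}_{(\vec{x})\setminus i}/x_i] = \underline{\psi}_{\vec{x}}[\underline{\rho}_{(\vec{x})\setminus i}/x_i]$. 

To end the proof of the lemma, it will be sufficient to prove that we can apply 
a $\Rightarrow$-elimination rule to the sequents $(G_j^{w_j^L})_{j=1}^{n-1} \vdash E \Rightarrow_s A$ and $(H_j^{r_j})_{j=1}^{n-1} \vdash E$ 
in order to get a proof $\pi$ of the sequent $(C_j^{w_j})_{j=1}^{n-1} \vdash A$. Indeed, the proof $\pi$ 
obtained in that way will satisfy $\underline{\pi}_{(\vec{x})\setminus i} = (\underline{\varphi'}_{(\vec{x})\setminus i})\, \underline{\psi'}_{(\vec{x})\setminus i} = \underline{\mu}_{\vec{x}}[\underline{\rho}_{(\vec{x})\setminus i}/x_i]$. 
Let $j \in \{1,\dots,n-1\}$. We have $C_j\restriction_{d(G_j)} = G_j$ and $C_j\restriction_{d(H_j)} = H_j$ simply because 
$G_j$ and $H_j$ are defined by restricting $C_j$. Moreover $d(G_j) = d(E_{s(j,i)}) + L_j$ and 
$d(H_j) = d(F_{s(j,i)}) + R_j$. Therefore $d(G_j) \cap d(H_j) = \emptyset$ and 

$$d(C_j) = d(A_{s(j,i)}) + d(B_j) = d(E_{s(j,i)}) + d(F_{s(j,i)}) + L_j + R_j = d(G_j) + d(H_j).$$

We have $w_j\restriction_{d(G_j)} = w_j^L$ by definition of $w_j^L$ as $w_j\restriction_{d(E_{s(j,i)}) + L_j}$. We have 

$$\begin{aligned}
w_j\restriction_{d(H_j)}\restriction_{d(F_{s(j,i)})} &= w_j\restriction_{d(A_{s(j,i)})}\restriction_{d(F_{s(j,i)})} = u_{s(j,i)}\restriction_{d(F_{s(j,i)})}\\
 &= s \circ t_{s(j,i)} = (s \circ r_j)\restriction_{d(F_{s(j,i)})}\\
w_j\restriction_{d(H_j)}\restriction_{R_j} &= w_j\restriction_{d(B_j)}\restriction_{R_j} = (u_i \circ v_j)\restriction_{R_j}\\
 &= u_i\restriction_{d(F_i)} \circ v_j^R = s \circ t_i \circ v_j^R = s \circ r_j\restriction_{R_j} = (s \circ r_j)\restriction_{R_j}
\end{aligned}$$

and therefore $w_j\restriction_{d(H_j)} = s \circ r_j$ as required. 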


We shall often use the two following consequences of the Substitution Lemma. 


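Lemma 5. Given a proof $\mu$ of $(A_j^{u_j})_{j=1}^n \vdash A$ and a proof $\rho$ of $B^v \vdash A_i$ (for 
some $i \in \{1,\dots,n\}$), there is a proof $\pi$ of $(A_j^{u_j})_{j=1}^{i-1}, B^{u_i \circ v}, (A_j^{u_j})_{j=i+1}^{n} \vdash A$ such 
that $\underline{\pi}_{\vec{x}} = \underline{\mu}_{\vec{x}}[\underline{\rho}_{x_i}/x_i]$. 

Proof. By weakening we have a proof $\mu'$ of $(A_j^{u_j})_{j=1}^{i-1}, (B\restriction_\emptyset)^{0_{d(A)}}, (A_j^{u_j})_{j=i}^{n} \vdash A$ 
such that $\underline{\mu'}_{\vec{x}'} = \underline{\mu}_{(\vec{x}')\setminus i}$ (where $\vec{x}'$ is a list of pairwise distinct variables of 
length $n+1$), as well as a proof $\rho'$ of $((A_j\restriction_\emptyset)^{0_{d(A_i)}})_{j=1}^{i-1}, B^v, ((A_j\restriction_\emptyset)^{0_{d(A_i)}})_{j=i+1}^{n} \vdash A_i$ such 
that $\underline{\rho'}_{(\vec{x}')\setminus (i+1)} = \underline{\rho}_{x'_i}$. By Lemma 4, we have a proof $\pi$ of $(A_j^{u_j})_{j=1}^{i-1}, B^{u_i \circ v}, (A_j^{u_j})_{j=i+1}^{n} \vdash 
A$ which satisfies $\underline{\pi}_{(\vec{x}')\setminus (i+1)} = \underline{\mu'}_{\vec{x}'}[\underline{\rho'}_{(\vec{x}')\setminus (i+1)}/x'_{i+1}] = \underline{\mu}_{(\vec{x}')\setminus i}[\underline{\rho}_{x'_i}/x'_{i+1}]$. 


Lemma 6. Given a proof $\mu$ of $A^v \vdash B$ and a proof $\rho$ of $(A_j^{u_j})_{j=1}^n \vdash A$, there is 
a proof $\pi$ of $(A_j^{v \circ u_j})_{j=1}^n \vdash B$ such that $\underline{\pi}_{\vec{x}} = \underline{\mu}_{x}[\underline{\rho}_{\vec{x}}/x]$. 

The proof is similar to the previous one. 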

If $A$ and $B$ are formulas such that $\underline{A} = \underline{B}$, $d(A) = d(B)$ and $\langle A\rangle = \langle B\rangle$, we 
say that $A$ and $B$ are similar and we write $A \sim B$. One fundamental property 
of our deduction system is that two formulas which represent the same family 
of intersection types are logically equivalent. 


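Theorem 1. If $A \sim B$ then $A^{\mathrm{Id}} \vdash B$ with a proof $\pi$ such that $\underline{\pi}_x \sim_\eta x$. 


Proof. Assume that $A = \alpha[f]$, then we have $B = A$ and $A^{\mathrm{Id}} \vdash B$ is an axiom. 
Assume that $A = (C \Rightarrow_u D)$ and $B = (E \Rightarrow_v F)$. We have $D \sim F$ and 
hence $D^{\mathrm{Id}} \vdash F$ with a proof $\rho$ such that $\underline{\rho}_x \sim_\eta x$. And there is a bijection 
$w : d(E) \to d(C)$ such that $w_*(E) \sim C$ and $u \circ w = v$. By inductive hypothesis 
we have a proof $\mu$ of $w_*(E)^{\mathrm{Id}} \vdash C$ such that $\underline{\mu}_y \sim_\eta y$, and hence using the axiom 
$E^w \vdash w_*(E)$ and Lemma 5 we have a proof $\mu'$ of $E^w \vdash C$ such that $\underline{\mu'}_y = \underline{\mu}_y$. 
There is a proof $\pi^1$ of $(C \Rightarrow_u D)^{\mathrm{Id}}, C^u \vdash D$ such that $\underline{\pi^1}_{x,y} = (x)\,y$ (consider 
the two axioms $(C \Rightarrow_u D)^{\mathrm{Id}}, (C\restriction_\emptyset)^{0_{d(D)}} \vdash C \Rightarrow_u D$ and $((C \Rightarrow_u D)\restriction_\emptyset)^{0_{d(C)}}, C^{\mathrm{Id}} \vdash C$ 
and use a $\Rightarrow$-elimination rule). So by Lemma 5 there is a proof $\pi^2$ of $(C \Rightarrow_u 
D)^{\mathrm{Id}}, E^{u \circ w} \vdash D$, that is of $(C \Rightarrow_u D)^{\mathrm{Id}}, E^v \vdash D$, such that $\underline{\pi^2}_{x,y} = (x)\,\underline{\mu}_y$. 
Applying Lemma 6 we get a proof $\pi^3$ of $(C \Rightarrow_u D)^{\mathrm{Id}}, E^v \vdash F$ such that $\underline{\pi^3}_{x,y} = 
\underline{\rho}_z[(x)\,\underline{\mu}_y/z]$. We get the expected proof $\pi$ by a $\Rightarrow$-introduction rule so that 
$\underline{\pi}_x = \lambda y\, \underline{\rho}_z[(x)\,\underline{\mu}_y/z]$. By inductive hypothesis $\underline{\pi}_x \sim_\eta x$. 


3.4 Relation between intersection types and LJ(I) 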


Now we explain the precise connection between non-idempotent intersection 
types and our logical system LJ(I). This connection consists of two statements: 


— the first one means that any proof of LJ(I) can be seen as a typing derivation 
in non-idempotent intersection types (soundness) 

— and the second one means that any non-idempotent intersection typing can 
be seen as a derivation in LJ(I) (completeness). 


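Theorem 2 (Soundness). Let $\pi$ be a deduction tree of the sequent $(A_i^{u_i})_{i=1}^n \vdash 
B$ and $\vec{x}$ a sequence of $n$ pairwise distinct variables. Then the $\lambda$-term $\underline{\pi}_{\vec{x}}$ 
satisfies $(x_i : \langle A_i^{u_i}\rangle_j : \underline{A_i})_{i=1}^n \vdash \underline{\pi}_{\vec{x}} : \langle B\rangle_j : \underline{B}$ in the intersection type system, for 
each $j \in d(B)$. 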


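Proof. We prove the first part by induction on $\pi$ (in the course of this induction, 
we recall the precise definition of $\underline{\pi}_{\vec{x}}$). If $\pi$ is the proof 

$$\frac{q \neq i \Rightarrow d(A_q) = \emptyset \quad\text{and}\quad u_i \text{ is a bijection}}
       {(A_q^{u_q})_{q=1}^n \vdash u_{i*}(A_i)}$$

(so that $B = u_{i*}(A_i)$) then $\underline{\pi}_{\vec{x}} = x_i$. We have $\langle A_q^{u_q}\rangle_j = [\,]$ if $q \neq i$, $\langle A_i^{u_i}\rangle_j = 
[\langle A_i\rangle_{u_i^{-1}(j)}]$ and $\langle u_{i*}(A_i)\rangle_j = \langle A_i\rangle_{u_i^{-1}(j)}$. It follows that $(x_q : \langle A_q^{u_q}\rangle_j : \underline{A_q})_{q=1}^n \vdash 
x_i : \langle B\rangle_j : \underline{B}$ is a valid axiom in the intersection type system. 

Assume that $\pi$ is the proof 

$$\frac{\begin{array}{c}\pi^0\\ \vdots\\ A_1^{u_1}, \dots, A_n^{u_n}, A^u \vdash B\end{array}}
       {A_1^{u_1}, \dots, A_n^{u_n} \vdash A \Rightarrow_u B}$$

where $\pi^0$ is the proof of the premise of the last rule of $\pi$. By inductive hypothesis 
the $\lambda$-term $\underline{\pi^0}_{\vec{x},x}$ satisfies $(x_i : \langle A_i^{u_i}\rangle_j : \underline{A_i})_{i=1}^n, x : \langle A^u\rangle_j : \underline{A} \vdash \underline{\pi^0}_{\vec{x},x} : \langle B\rangle_j : \underline{B}$ 
from which we deduce $(x_i : \langle A_i^{u_i}\rangle_j : \underline{A_i})_{i=1}^n \vdash \lambda x^{\underline{A}}\, \underline{\pi^0}_{\vec{x},x} : (\langle A^u\rangle_j, \langle B\rangle_j) : \underline{A} \Rightarrow \underline{B}$ 
which is the required judgment since $\underline{\pi}_{\vec{x}} = \lambda x^{\underline{A}}\, \underline{\pi^0}_{\vec{x},x}$ and $(\langle A^u\rangle_j, \langle B\rangle_j) = 
\langle A \Rightarrow_u B\rangle_j$ as easily checked. 

Assume last that $\pi$ ends with 

$$\frac{\begin{array}{c}\pi^1\\ \vdots\\ C_1^{u_1}, \dots, C_n^{u_n} \vdash A \Rightarrow_u B\end{array} \qquad
        \begin{array}{c}\pi^2\\ \vdots\\ D_1^{v_1}, \dots, D_n^{v_n} \vdash A\end{array}}
       {E_1^{w_1}, \dots, E_n^{w_n} \vdash B}$$

with: for each $i = 1,\dots,n$ there are two disjoint sets $L_i$ and $R_i$ such that 
$d(E_i) = L_i + R_i$, $C_i = E_i\restriction_{L_i}$, $D_i = E_i\restriction_{R_i}$, $w_i\restriction_{L_i} = u_i$, and $w_i\restriction_{R_i} = u \circ v_i$. 
Let $j \in d(B)$. By inductive hypothesis, the judgment $(x_i : \langle C_i^{u_i}\rangle_j : \underline{C_i})_{i=1}^n \vdash 
\underline{\pi^1}_{\vec{x}} : \langle A \Rightarrow_u B\rangle_j : \underline{A} \Rightarrow \underline{B}$ is derivable in the intersection type system. Let $K_j = 
u^{-1}(\{j\})$, which is a finite subset of $d(A)$. By inductive hypothesis again, for 
each $k \in K_j$ we have $(x_i : \langle D_i^{v_i}\rangle_k : \underline{D_i})_{i=1}^n \vdash \underline{\pi^2}_{\vec{x}} : \langle A\rangle_k : \underline{A}$. Now observe that 
$\langle A \Rightarrow_u B\rangle_j = ([\langle A\rangle_k \mid k \in K_j], \langle B\rangle_j)$ so that 

$$(x_i : \langle C_i^{u_i}\rangle_j + \sum_{k \in K_j} \langle D_i^{v_i}\rangle_k : \underline{E_i})_{i=1}^n \vdash (\underline{\pi^1}_{\vec{x}})\, \underline{\pi^2}_{\vec{x}} : \langle B\rangle_j : \underline{B}$$

is derivable in intersection types (remember that $\underline{C_i} = \underline{D_i} = \underline{E_i}$). Since $\underline{\pi}_{\vec{x}} = 
(\underline{\pi^1}_{\vec{x}})\, \underline{\pi^2}_{\vec{x}}$ it will be sufficient to prove that 

$$\langle E_i^{w_i}\rangle_j = \langle C_i^{u_i}\rangle_j + \sum_{k \in K_j} \langle D_i^{v_i}\rangle_k. \qquad (2)$$

For this, since $\langle E_i^{w_i}\rangle_j = [\langle E_i\rangle_l \mid w_i(l) = j]$, consider an element $l$ of $d(E_i)$ such 
that $w_i(l) = j$. There are two possibilities: (1) either $l \in L_i$ and in that case we 
know that $\langle E_i\rangle_l = \langle C_i\rangle_l$ since $E_i\restriction_{L_i} = C_i$ and moreover we have $u_i(l) = w_i(l) = j$ 
(2) or $l \in R_i$. In that case we have $\langle E_i\rangle_l = \langle D_i\rangle_l$ since $E_i\restriction_{R_i} = D_i$. Moreover 
$u(v_i(l)) = w_i(l) = j$ and hence $v_i(l) \in K_j$. Therefore 

$$\begin{aligned}
{[\langle E_i\rangle_l \mid l \in L_i \text{ and } w_i(l) = j]} &= [\langle C_i\rangle_l \mid u_i(l) = j] = \langle C_i^{u_i}\rangle_j\\
{[\langle E_i\rangle_l \mid l \in R_i \text{ and } w_i(l) = j]} &= [\langle D_i\rangle_l \mid v_i(l) \in K_j] = \sum_{k \in K_j} \langle D_i^{v_i}\rangle_k
\end{aligned}$$

and (2) follows. 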
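Equation (2) can be checked numerically on a small instance. The Python sketch below is only an illustration: the concrete data (`E`, `L`, `R`, the maps `u_i`, `v_i`, `u`) are made up for the example, families are dicts, and multisets are `collections.Counter` objects.

```python
from collections import Counter

def fiber(family, u, j):
    """<A^u>_j = [ <A>_k | u(k) = j ], with the family <A> and u given as dicts."""
    return Counter(family[k] for k in family if u[k] == j)

# A hypothetical family <E_i> on d(E_i) = L + R, and its two restrictions.
E = {1: 'a', 2: 'b', 3: 'a', 4: 'c'}
L, R = {1, 2}, {3, 4}
C = {l: E[l] for l in L}              # C_i is E_i restricted to L_i
D = {r: E[r] for r in R}              # D_i is E_i restricted to R_i

u_i = {1: 'j0', 2: 'j1'}              # u_i : L_i -> d(B)
v_i = {3: 'k0', 4: 'k1'}              # v_i : R_i -> d(A)
u = {'k0': 'j0', 'k1': 'j0'}          # u : d(A) -> d(B)
w_i = {**u_i, **{r: u[v_i[r]] for r in R}}   # w_i is u_i on L_i and u o v_i on R_i

def lhs(j):
    return fiber(E, w_i, j)

def rhs(j):
    K_j = [k for k in u if u[k] == j]          # K_j = u^{-1}({j})
    return fiber(C, u_i, j) + sum((fiber(D, v_i, k) for k in K_j), Counter())
```

Both sides agree at each index of $d(B)$ in this instance, which is what the case analysis on $l \in L_i$ and $l \in R_i$ establishes in general.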


Theorem 3 (Completeness). Let $J \subseteq I$. Let $M$ be a $\lambda$-term and $x_1, \dots, x_n$ 
be pairwise distinct variables, such that $(x_i : m_i^j : \sigma_i)_{i=1}^n \vdash M : b_j : \tau$ in the 
intersection type system for all $j \in J$. Let $A_1, \dots, A_n$ and $B$ be formulas and 
let $u_1, \dots, u_n$ be almost injective functions such that $u_i : d(A_i) \to J = d(B)$. 
Assume also that $\underline{A_i} = \sigma_i$ for each $i = 1,\dots,n$ and that $\underline{B} = \tau$. Last assume 
that, for all $j \in J$, one has $\langle B\rangle_j = b_j$ and $\langle A_i^{u_i}\rangle_j = m_i^j$ for $i = 1,\dots,n$. Then 
the judgment $(A_i^{u_i})_{i=1}^n \vdash B$ has a proof $\pi$ such that $\underline{\pi}_{\vec{x}} \sim_\eta M$. 


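Proof. By induction on $M$. Assume first that $M = x_i$ for some $i \in \{1,\dots,n\}$. 
Then we must have $\tau = \sigma_i$, $m_q^j = [\,]$ for $q \neq i$ and $m_i^j = [b_j]$ for all $j \in J$. 
Therefore $d(A_q) = \emptyset$ and $u_q$ is the empty function for $q \neq i$, $u_i$ is a bijection 
$d(A_i) \to J$ and $\forall k \in d(A_i)\ \langle A_i\rangle_k = b_{u_i(k)}$; in other words $u_{i*}(A_i) \sim B$. By 
Theorem 1 we know that the judgment $(u_{i*}(A_i))^{\mathrm{Id}} \vdash B$ is provable in LJ(I) with 
a proof $\rho$ such that $\underline{\rho}_x \sim_\eta x$. We have a proof $\theta$ of $(A_q^{u_q})_{q=1}^n \vdash u_{i*}(A_i)$ which 
consists of an axiom so that $\underline{\theta}_{\vec{x}} = x_i$ and hence by Lemma 6 we have a proof $\pi$ 
of $(A_q^{u_q})_{q=1}^n \vdash B$ such that $\underline{\pi}_{\vec{x}} = \underline{\rho}_x[\underline{\theta}_{\vec{x}}/x] \sim_\eta x_i$. 

Assume that $M = \lambda x^\sigma\, N$, that $\tau = (\sigma \Rightarrow \varphi)$ and that we have a 
family of deductions (for $j \in J$) of $(x_i : m_i^j : \sigma_i)_{i=1}^n \vdash M : (m^j, c_j) : \sigma \Rightarrow \varphi$ with 
$b_j = (m^j, c_j)$ and the premise of this conclusion in each of these deductions is 
$(x_i : m_i^j : \sigma_i)_{i=1}^n, x : m^j : \sigma \vdash N : c_j : \varphi$. We must have $B = (C \Rightarrow_u D)$ with 
$\underline{D} = \varphi$, $\underline{C} = \sigma$, $d(D) = J$, $u : d(C) \to d(D)$ almost injective, $\langle D\rangle_j = c_j$ and 
$[\langle C\rangle_k \mid k \in d(C) \text{ and } u(k) = j] = m^j$, that is $\langle C^u\rangle_j = m^j$, for each $j \in J$. 
By inductive hypothesis we have a proof $\rho$ of $(A_i^{u_i})_{i=1}^n, C^u \vdash D$ such that 
$\underline{\rho}_{\vec{x},x} \sim_\eta N$ from which we obtain a proof $\pi$ of $(A_i^{u_i})_{i=1}^n \vdash C \Rightarrow_u D$ such that 
$\underline{\pi}_{\vec{x}} = \lambda x^\sigma\, \underline{\rho}_{\vec{x},x} \sim_\eta M$ as expected. 

Assume last that $M = (N)\,P$ and that we have a $J$-indexed family of 
deductions $(x_i : m_i^j : \sigma_i)_{i=1}^n \vdash M : b_j : \tau$. Let $A_1, \dots, A_n$, $u_1, \dots, u_n$ and $B$ be LJ(I) 
formulas and almost injective functions as in the statement of the theorem. 

Let $j \in J$. There is a finite set $L_j \subseteq I$ and multisets $m_i^{j,0}$, $(m_i^{j,l})_{l \in L_j}$ such 
that we have deductions$^7$ of $(x_i : m_i^{j,0} : \sigma_i)_{i=1}^n \vdash N : ([a_l^j \mid l \in L_j], b_j) : \sigma \Rightarrow \tau$ 
and, for each $l \in L_j$, of $(x_i : m_i^{j,l} : \sigma_i)_{i=1}^n \vdash P : a_l^j : \sigma$ with 

$$m_i^j = m_i^{j,0} + \sum_{l \in L_j} m_i^{j,l}. \qquad (3)$$

We assume the finite sets $L_j$ to be pairwise disjoint (this is possible because $I$ 
is infinite) and we use $L$ for their union. Let $u : L \to J$ be the function which 
maps $l \in L$ to the unique $j$ such that $l \in L_j$; this function is almost injective. 

Let $A$ be an LJ(I) formula such that $\underline{A} = \sigma$, $d(A) = L$ and $\langle A\rangle_l = a_l^{u(l)}$; such a 
formula exists by Proposition 1. 
Let $i \in \{1,\dots,n\}$. For each $j \in J$ we know that 

$$[\langle A_i\rangle_r \mid r \in d(A_i) \text{ and } u_i(r) = j] = m_i^j = m_i^{j,0} + \sum_{l \in L_j} m_i^{j,l}$$

and hence we can split the set $d(A_i) \cap u_i^{-1}(\{j\})$ into disjoint subsets $R_i^{j,0}$ and 
$(R_i^{j,l})_{l \in L_j}$ in such a way that 

$$[\langle A_i\rangle_r \mid r \in R_i^{j,0}] = m_i^{j,0} \quad\text{and}\quad \forall l \in L_j\ [\langle A_i\rangle_r \mid r \in R_i^{j,l}] = m_i^{j,l}.$$

We set $R_i^0 = \bigcup_{j \in J} R_i^{j,0}$; observe that this is a disjoint union because $R_i^{j,0} \subseteq 
u_i^{-1}(\{j\})$. Similarly we define $R_i^1 = \bigcup_{l \in L} R_i^{u(l),l}$ which is a disjoint union for 
the following reason: if $l, l' \in L$ satisfy $u(l) = u(l') = j$ then $R_i^{j,l}$ and $R_i^{j,l'}$ 
have been chosen disjoint and if $u(l) = j$ and $u(l') = j'$ with $j \neq j'$ we have 
$R_i^{j,l} \subseteq u_i^{-1}(\{j\})$ and $R_i^{j',l'} \subseteq u_i^{-1}(\{j'\})$. Let $v_i : R_i^1 \to L$ be defined by: $v_i(r)$ is 
the unique $l \in L$ such that $r \in R_i^{u(l),l}$. Since each $R_i^{j,l}$ is finite the function $v_i$ is 
almost injective. Moreover $u \circ v_i = u_i\restriction_{R_i^1}$. 

We use $u_i'$ for the restriction of $u_i$ to $R_i^0$ so that $u_i' : R_i^0 \to J$. By 
inductive hypothesis we have $((A_i\restriction_{R_i^0})^{u_i'})_{i=1}^n \vdash A \Rightarrow_u B$ with a proof $\mu$ such that 
$\underline{\mu}_{\vec{x}} \sim_\eta N$. Indeed $[\langle A_i\restriction_{R_i^0}\rangle_r \mid r \in R_i^0 \text{ and } u_i'(r) = j] = m_i^{j,0}$ and $\langle A \Rightarrow_u B\rangle_j = 
([a_l^j \mid u(l) = j], b_j)$ for each $j \in J$. For the same reason we have $((A_i\restriction_{R_i^1})^{v_i})_{i=1}^n \vdash 
A$ with a proof $\rho$ such that $\underline{\rho}_{\vec{x}} \sim_\eta P$. Indeed for each $l \in L = d(A)$ we have 
$[\langle A_i\restriction_{R_i^1}\rangle_r \mid v_i(r) = l] = m_i^{j,l}$ and $\langle A\rangle_l = a_l^j$ where $j = u(l)$. By an application 
rule we get a proof $\pi$ of $(A_i^{u_i})_{i=1}^n \vdash B$ such that $\underline{\pi}_{\vec{x}} = (\underline{\mu}_{\vec{x}})\, \underline{\rho}_{\vec{x}} \sim_\eta (N)\,P = M$ 
as contended. 


7 Notice that our $\lambda$-calculus is in Church style and hence the type $\sigma$ is uniquely 
determined by the sub-term $N$ of $M$. 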


4 The untyped Scott case 


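Since intersection types usually apply to the pure $\lambda$-calculus, we move now to 
this setting by choosing in $\mathbf{Rel}_!$ the set $R_\infty$ as model of the pure $\lambda$-calculus. The 
$R_\infty$ intersection typing system has the elements of $R_\infty$ as types, and the typing 
rules involve sequents of shape $(x_i : m_i)_{i=1}^n \vdash M : a$ where $m_i \in \mathcal{M}_{\mathrm{fin}}(R_\infty)$ and 
$a \in R_\infty$. 

We use $\Lambda$ for the set of terms of the pure $\lambda$-calculus, and $\Lambda_\Omega$ as the pure 
$\lambda$-calculus extended with a constant $\Omega$ subject to the two following $\leadsto_\omega$ reduction 
rules: $\lambda x\, \Omega \leadsto_\omega \Omega$ and $(\Omega)\,M \leadsto_\omega \Omega$. We use $\sim_{\eta\omega}$ for the least congruence on $\Lambda_\Omega$ 
which contains $\leadsto_\eta$ and $\leadsto_\omega$, and similarly for $\sim_{\beta\eta\omega}$. We define a family $(H(x))_{x \in \mathcal{V}}$ 
(indexed by the variables $x$) of subsets of $\Lambda_\Omega$ minimal such that, for any sequences $\vec{x} = (x_1,\dots,x_n)$ and $\vec{y} = 
(y_1,\dots,y_k)$ such that $\vec{x}, \vec{y}$ is repetition-free, and for any terms $M_i \in H(x_i)$ (for 
$i = 1,\dots,n$), one has $\lambda\vec{x}\,\lambda\vec{y}\,(x)\,M_1 \cdots M_n O_1 \cdots O_l \in H(x)$ where $O_j \sim_\omega \Omega$ 
for $j = 1,\dots,l$. Notice that $x \in H(x)$. 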
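The two $\omega$ reduction rules can be experimented with through a small normaliser. The Python sketch below is an illustration under assumptions of its own: terms of $\Lambda_\Omega$ are nested tuples `('var', x)`, `('lam', x, body)`, `('app', fun, arg)` and `('omega',)`, and the function name `omega_normalize` is hypothetical. It applies $\lambda x\,\Omega \leadsto_\omega \Omega$ and $(\Omega)\,M \leadsto_\omega \Omega$ bottom-up until no redex remains.

```python
OMEGA = ('omega',)

def omega_normalize(t):
    """Rewrite t with the two omega rules, innermost subterms first."""
    tag = t[0]
    if tag == 'lam':
        body = omega_normalize(t[2])
        # lambda x. Omega reduces to Omega
        return OMEGA if body == OMEGA else ('lam', t[1], body)
    if tag == 'app':
        fun, arg = omega_normalize(t[1]), omega_normalize(t[2])
        # (Omega) M reduces to Omega
        return OMEGA if fun == OMEGA else ('app', fun, arg)
    return t  # variables and Omega are already normal

collapses = ('app', ('lam', 'x', OMEGA), ('var', 'y'))   # (lambda x. Omega) y
stays = ('lam', 'x', ('app', ('var', 'x'), OMEGA))       # lambda x. (x) Omega
```

A term whose normal form is `OMEGA` is in particular $\sim_\omega \Omega$, so this gives a simple sufficient test for the side condition $O_j \sim_\omega \Omega$ used in the definition of $H(x)$.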

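The typing rules of $R_\infty$ are 

$$\frac{}{x_1 : [\,], \dots, x_i : [a], \dots, x_n : [\,] \vdash x_i : a}
\qquad
\frac{\Phi, x : m \vdash M : a}{\Phi \vdash \lambda x\, M : (m, a)}$$

$$\frac{\Phi \vdash M : ([a_1,\dots,a_k], b) \qquad (\Phi_j \vdash N : a_j)_{j=1}^k}
       {\Phi + \sum_{j=1}^k \Phi_j \vdash (M)\,N : b}$$

where we use the following convention: when we write $\Phi + \Psi$ it is assumed that 
$\Phi$ is of shape $(x_i : m_i)_{i=1}^n$ and $\Psi$ is of shape $(x_i : p_i)_{i=1}^n$, and then $\Phi + \Psi$ is 
$(x_i : m_i + p_i)_{i=1}^n$. This typing system is just a “proof-theoretic” rephrasing of the 
denotational semantics of the terms of $\Lambda_\Omega$ in $R_\infty$. 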

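Proposition 2. Let $M, M' \in \Lambda_\Omega$ and $\vec{x} = (x_1,\dots,x_n)$ be a list of pairwise 
distinct variables containing all the free variables of $M$ and $M'$. Let $m_i \in \mathcal{M}_{\mathrm{fin}}(R_\infty)$ 
for $i = 1,\dots,n$ and $b \in R_\infty$. If $M \sim_{\beta\eta\omega} M'$ then $(x_i : m_i)_{i=1}^n \vdash M : b$ iff 
$(x_i : m_i)_{i=1}^n \vdash M' : b$. 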


4.1 Formulas 


We define the associated formulas as follows, each formula $A$ being given together 
with $d(A) \subseteq I$ and $\langle A\rangle \in R_\infty^{d(A)}$. 


— If $J \subseteq I$ then $\varepsilon_J$ is a formula with $d(\varepsilon_J) = J$ and $\langle\varepsilon_J\rangle_j = e$ for $j \in J$ 

— and if $A$ and $B$ are formulas and $u : d(A) \to d(B)$ is almost injective 
then $A \Rightarrow_u B$ is a formula with $d(A \Rightarrow_u B) = d(B)$ and $\langle A \Rightarrow_u B\rangle_j = 
([\langle A\rangle_k \mid u(k) = j], \langle B\rangle_j) \in R_\infty$. 

We can consider that there is a type $o$ of pure $\lambda$-terms interpreted as $R_\infty$ in 
$\mathbf{Rel}_!$ such that $(o \Rightarrow o) = o$, and then for any formula $A$ we have $\underline{A} = o$. 

Operations of restriction and relocation of formulas are the same as in 
Section 3 (setting $\varepsilon_J\restriction_K = \varepsilon_{J \cap K}$) and satisfy the same properties, for instance 
$\langle A\restriction_K\rangle = \langle A\rangle\restriction_K$ and one sets $u_*(\varepsilon_J) = \varepsilon_K$ if $u : J \to K$ is a bijection. 

The deduction rules are exactly the same as those of Section 3, plus the axiom 
$\vdash \varepsilon_\emptyset$. With any deduction $\pi$ of $(A_i^{u_i})_{i=1}^n \vdash B$ and sequence $\vec{x} = (x_1,\dots,x_n)$ of 
pairwise distinct variables, we can associate a pure term $\underline{\pi}_{\vec{x}} \in \Lambda_\Omega$ defined exactly as 
in Section 3 (just drop the types associated with variables in abstractions). If $\pi$ 
consists of an instance of the additional axiom, we set $\underline{\pi}_{\vec{x}} = \Omega$. 


Lemma 7. Let $A, A_1, \dots, A_n$ be formulas such that $d(A) = d(A_i) = \emptyset$. Then 
$(A_i^{0_\emptyset})_{i=1}^n \vdash A$ is provable by a proof $\pi$ which satisfies $\underline{\pi}_{\vec{x}} \sim_\omega \Omega$. 

The proof is a straightforward induction on $A$ using the additional axiom, 
Lemma 1 and the observation that if $d(B \Rightarrow_u C) = \emptyset$ then $u = 0_\emptyset$. 

One can easily define a size function $\mathrm{sz} : R_\infty \to \mathbb{N}$ such that $\mathrm{sz}(e) = 0$ and 
$\mathrm{sz}([a_1,\dots,a_k], a) = \mathrm{sz}(a) + \sum_{i=1}^k (1 + \mathrm{sz}(a_i))$. First we have to prove an adapted 
version of Proposition 1; here it will be restricted to finite sets. 
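The size function can be written down directly once a finite encoding of the elements of $R_\infty$ is fixed. The Python sketch below assumes, for illustration only, that such an element is either the constant `E` (standing for $e$) or a pair `(m, a)` where `m` is a tuple of elements (a finite multiset) and `a` is an element; the name `sz` follows the text, the encoding is hypothetical.

```python
E = 'e'  # stands for e = ([], [], ...)

def sz(a):
    """Size of an element of R_infinity given as E or as a pair (multiset, element)."""
    if a == E:
        return 0
    m, rest = a
    return sz(rest) + sum(1 + sz(x) for x in m)

unit = ((), E)                          # ([], e), another encoding of e
one = ((E,), E)                         # ([e], e)
nested = ((((E, E), E),), ((E,), E))    # ([([e, e], e)], ([e], e))
```

Since $e = ([\,], e)$, the encoding is not unique, but both `E` and `((), E)` receive size 0, so the function is compatible with this identification on the example.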


Proposition 3. Let $J$ be a finite subset of $I$ and $f \in R_\infty^J$. There is a formula 
$A$ such that $d(A) = J$ and $\langle A\rangle = f$. 


Proof. Observe that, since $J$ is finite, there is an $N \in \mathbb{N}$ such that $\forall j \in J\ \forall q \in 
\mathbb{N}\ q \geq N \Rightarrow f(j)_q = [\,]$ (remember that $f(j) \in \mathcal{M}_{\mathrm{fin}}(R_\infty)^{\mathbb{N}}$). Let $N(f)$ be the 
least such $N$. We set $\mathrm{sz}(f) = \sum_{j \in J} \mathrm{sz}(f(j))$ and the proof is by induction on 
$(\mathrm{sz}(f), N(f))$ lexicographically. 

If $\mathrm{sz}(f) = 0$ this means that $f(j) = e$ for all $j \in J$ and hence we can 
take $A = \varepsilon_J$. Assume that $\mathrm{sz}(f) > 0$; one can write$^8$ $f(j) = (m_j, a_j)$ with 
$m_j \in \mathcal{M}_{\mathrm{fin}}(R_\infty)$ and $a_j \in R_\infty$ for each $j \in J$. Just as in the proof of Proposition 1 
we choose a set $K$, a function $g : K \to R_\infty$ and an almost injective function 
$u : K \to J$ such that $m_j = [g(k) \mid u(k) = j]$. The set $K$ is finite since $J$ is 
and we have $\mathrm{sz}(g) < \mathrm{sz}(f)$ because $\mathrm{sz}(f) > 0$. Therefore by inductive hypothesis 
there is a formula $B$ such that $d(B) = K$ and $\langle B\rangle = g$. Let $f' : J \to R_\infty$ be defined 
by $f'(j) = a_j$; we have $\mathrm{sz}(f') \leq \mathrm{sz}(f)$ and $N(f') < N(f)$ and hence by inductive 
hypothesis there is a formula $C$ such that $\langle C\rangle = f'$. We set $A = (B \Rightarrow_u C)$ which 
satisfies $\langle A\rangle = f$ as required. 


Theorem 1 still holds up to some mild adaptation. First notice that $A \sim B$ 
simply means now that $d(A) = d(B)$ and $\langle A\rangle = \langle B\rangle$. 


Theorem 4. If $A$ and $B$ are such that $A \sim B$ then $A^{\mathrm{Id}} \vdash B$ with a proof $\pi$ 
which satisfies $\underline{\pi}_x \in H(x)$. 


8 This is also possible if $\mathrm{sz}(f) = 0$ actually. 

Proof. By induction on the sum of the sizes of $A$ and $B$. Assume that $A = \varepsilon_J$ 
so that $d(B) = J$ and $\forall j \in J\ \langle B\rangle_j = e$. There are two cases as to $B$. In the 
first case $B$ is of shape $\varepsilon_K$ but then we must have $K = J$ and we can take for 
$\pi$ an axiom so that $\underline{\pi}_x = x \in H(x)$. Otherwise we have $B = (C \Rightarrow_u D)$ with 
$d(D) = J$, $\forall j \in J\ \langle D\rangle_j = e$ and $d(C) = \emptyset$, so that $u = 0_J$. We have $A \sim D$ and 
hence by inductive hypothesis we have a proof $\rho$ of $A^{\mathrm{Id}} \vdash D$ such that $\underline{\rho}_x \in H(x)$. 
By weakening and $\Rightarrow$-introduction we get a proof $\pi$ of $A^{\mathrm{Id}} \vdash B$ which satisfies 
$\underline{\pi}_x = \lambda y\, \underline{\rho}_x \in H(x)$. 

Assume that $A = (C \Rightarrow_u D)$. If $B = \varepsilon_J$ then we must have $d(C) = \emptyset$, $u = 0_J$ 
and $D \sim B$ and hence by inductive hypothesis we have a proof $\rho$ of $D^{\mathrm{Id}} \vdash B$ 
such that $\underline{\rho}_y \in H(y)$. By Lemma 7 there is a proof $\theta$ of $\vdash C$ such that $\underline{\theta} \sim_\omega \Omega$. 
Hence there is a proof $\pi$ of $A^{\mathrm{Id}} \vdash B$ such that $\underline{\pi}_x = \underline{\rho}_y[(x)\,\underline{\theta}/y] \in H(x)$. 

Assume last that $B = (E \Rightarrow_v F)$, then we must have $D \sim F$ and there must 
be a bijection $w : d(E) \to d(C)$ such that $u \circ w = v$ and $w_*(E) \sim C$. We reason 
as in the proof of Theorem 1: by inductive hypothesis we have a proof $\rho$ of $D^{\mathrm{Id}} \vdash F$ 
and a proof $\mu$ of $w_*(E)^{\mathrm{Id}} \vdash C$ from which we build a proof $\pi$ of $A^{\mathrm{Id}} \vdash B$ such 
that $\underline{\pi}_x = \lambda y\, \underline{\rho}_z[(x)\,\underline{\mu}_y/z] \in H(x)$ by inductive hypothesis. 


Theorem 5 (Soundness). Let 7 be a deduction tree of Aj*,..., A%™ F B and 

a sequence ofn pairwise distinct variables. Then the A-term Tẹ E€ Ag satisfies 
(xi : (AF j) F ae : (B); in the Ro intersection type system, for each j € 
d(B). 


The proof is exactly the same as that of Theorem 2, dropping all simple types. 
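For every λ-term $M \in \Lambda_\Omega$, we define $\mathcal{H}_\Omega(M)$ as the least subset of
$\Lambda_\Omega$ such that:


— if $O \in \Lambda_\Omega$ and $O \sim_\omega \Omega$ then $O \in \mathcal{H}_\Omega(M)$ for all $M \in \Lambda_\Omega$

— if $M = x$ then $\mathcal{H}(x) \subseteq \mathcal{H}_\Omega(M)$

— if $M = \lambda y\, N$ and $N' \in \mathcal{H}_\Omega(N)$ then $\lambda y\, N' \in \mathcal{H}_\Omega(M)$

— if $M = (N)\,P$, $N' \in \mathcal{H}_\Omega(N)$ and $P' \in \mathcal{H}_\Omega(P)$ then $(N')\,P' \in \mathcal{H}_\Omega(M)$.

The elements of $\mathcal{H}_\Omega(M)$ can probably be seen as approximates of $M$.
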
Theorem 6 (Completeness). Let J C I be finite. Let M € Ag and £1,..., En 
be pairwise distinct variables, such that (x; :m?)?_, F M : bj in the Ræ inter- 
section type system for all j € J. Let Aj,...,An and B be formulas and let 
U1,+++,Un be almost injective functions such that u; : d(A;) > J = d(B). As- 
sume also that, for allj € J, one has (B); = bj and (Aj); =m? fori =1,...,n. 
Then the judgment Aj',...,Av™ F B has a proof n such that te E€ Ho(M). 


The proof is very similar to that of Theorem 3. 


5 Concluding remarks and acknowledgments 


The results presented in this paper show that, at least in non-idempotent
intersection types, the problem of knowing whether all elements of a given family of
intersection types $(a_j)_{j \in J}$ are inhabited by a common λ-term can be reformulated
logically: is it true that one (or equivalently, any) of the indexed formulas $A$
such that $d(A) = J$ and $\forall j \in J\ \langle A\rangle_j = a_j$ is provable in LJ(I)? Such a strong
connection between intersection types and Indexed Linear Logic was already mentioned
in the introduction of [2], but we never made it more explicit until now.

To conclude we propose a typed λ-calculus à la Church to denote proofs of
the LJ(I) system of Section 4. The syntax of pre-terms is given by
$s,t,\dots ::= x[J] \mid \lambda x : A^u\, s \mid (s)\,t$ where in $x[J]$, $x$ is a variable and $J \subseteq I$ and, in $\lambda x : A^u\, s$,
$u$ is an almost injective function from $d(A)$ to a set $J \subseteq I$. Given a pre-term
$s$ and a variable $x$, the domain of $x$ in $s$ is the subset $\mathsf{dom}(x,s)$ of $I$ given by
$\mathsf{dom}(x,x[J]) = J$, $\mathsf{dom}(x,y[J]) = \emptyset$ if $y \neq x$, $\mathsf{dom}(x,\lambda y : A^u\, s) = \mathsf{dom}(x,s)$
(assuming of course $y \neq x$) and $\mathsf{dom}(x,(s)\,t) = \mathsf{dom}(x,s) \cup \mathsf{dom}(x,t)$. Then
a pre-term $s$ is a term if any subterm of $s$ which is of shape $(s_1)\,s_2$ satisfies
$\mathsf{dom}(x,s_1) \cap \mathsf{dom}(x,s_2) = \emptyset$ for every variable $x$. A typing judgment is an expression
$(x_i : A_i^{u_i})_{i=1}^n \vdash s : B$ where the $x_i$'s are pairwise distinct variables, $s$ is a term
and each $u_i$ is an almost injective function $d(A_i) \to d(B)$. The following typing
rules exactly mimic the logical rules of LJ(I):


d(A) =90 
(zi: APY) F 2: A 


q#i=d(A;) = Í and u; bijection (zi: A hepzi: Aks: B 
(£q : Ag" 31 F a, [d(A,)] : ui (4;) (xi: A; ei F Ag A” s: ASau B 
(xi : Aj lomite ei Fs: A>, B (a; : Aj lamias t) i1 Ft:A 


Ge A AO T: 


t 


The properties of this calculus, and more specifically of its β-reduction, and its
connections with the resource calculus of [9] will be explored in further work.
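
The domain function and the condition singling out terms among pre-terms, as defined above, can be illustrated by a minimal Python sketch. The class names `Var`, `Lam`, `App` and the helpers `names` and `is_term` are hypothetical, indices are modelled as integers, and the type annotation $A^u$ on abstractions is omitted since it plays no role in the domain computation. The sketch also assumes that bound variable names are distinct from all other names.

```python
from dataclasses import dataclass
from typing import FrozenSet, Set, Union


@dataclass(frozen=True)
class Var:
    name: str
    indices: FrozenSet[int]  # the set J in x[J]


@dataclass(frozen=True)
class Lam:
    var: str
    body: "PreTerm"  # the annotation A^u is omitted in this sketch


@dataclass(frozen=True)
class App:
    fun: "PreTerm"
    arg: "PreTerm"


PreTerm = Union[Var, Lam, App]


def dom(x: str, s: PreTerm) -> FrozenSet[int]:
    """Domain of variable x in pre-term s."""
    if isinstance(s, Var):
        return s.indices if s.name == x else frozenset()
    if isinstance(s, Lam):
        return dom(x, s.body)  # assumes s.var != x
    return dom(x, s.fun) | dom(x, s.arg)


def names(s: PreTerm) -> Set[str]:
    """All variable names occurring in s."""
    if isinstance(s, Var):
        return {s.name}
    if isinstance(s, Lam):
        return names(s.body)
    return names(s.fun) | names(s.arg)


def is_term(s: PreTerm) -> bool:
    """Every application (s1) s2 must use disjoint index sets for each variable."""
    if isinstance(s, Var):
        return True
    if isinstance(s, Lam):
        return is_term(s.body)
    disjoint = all(not (dom(x, s.fun) & dom(x, s.arg)) for x in names(s))
    return disjoint and is_term(s.fun) and is_term(s.arg)
```

For instance, $(x[\{1\}])\,x[\{2\}]$ is a term whereas $(x[\{1\}])\,x[\{1,2\}]$ is not, since index 1 would be used on both sides of the application.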

Another major objective will be to better understand the meaning of LJ(I) 
formulas, using ideas developed in [3] where a phase semantics is introduced and 
related to (non-uniform) coherence space semantics. In the intuitionistic present 
setting, it is tempting to look for Kripke-like interpretations with the hope of 
generalizing indexed logic beyond the (perhaps too) specific relational setting 
we started from. 

Last, we would like to thank Luigi Liquori and Claude Stolze for many helpful 
discussions on intersection types and the referees for their careful reading and 
insightful comments and suggestions. 
Abstract. In this paper, we investigate the problem of synthesizing
computable functions of infinite words over an infinite alphabet (data
ω-words). The notion of computability is defined through Turing machines
with infinite inputs which can produce the corresponding infinite outputs
in the limit. We use non-deterministic transducers equipped with registers,
an extension of register automata with outputs, to specify functions. Such
transducers may not define functions but more generally relations of data
ω-words, and we show that it is PSPACE-complete to test whether a given
transducer defines a function. Then, given a function defined by some
register transducer, we show that it is decidable (and again, PSPACE-c)
whether such function is computable. As for the known finite alphabet
case, we show that computability and continuity coincide for functions
defined by register transducers, and show how to decide continuity. We
also define a subclass for which those problems are PTIME.


Keywords: Data Words · Register Automata · Register Transducers ·
Functionality · Continuity · Computability.


1 Introduction 


Context Program synthesis aims at deriving, in an automatic way, a program 
that fulfils a given specification. Such setting is very appealing when for instance 
the specification describes, in some abstract formalism (an automaton or ideally 
a logic), important properties that the program must satisfy. The synthesised 
program is then correct-by-construction with regards to those properties. It is 
particularly important and desirable for the design of safety-critical systems with 
hard dependability constraints, which are notoriously hard to design correctly. 
Program synthesis is hard to realise for general-purpose programming
languages but important progress has been made recently in the automatic synthesis
of reactive systems. In this context, the system continuously receives input signals
to which it must react by producing output signals. Such systems are not assumed
to terminate and their executions are usually modelled as infinite words over
the alphabets of input and output signals. A specification is thus a set of pairs
(in,out), where in and out are infinite words, such that out is a legitimate output
for in. Most methods for reactive system synthesis only work for synchronous
systems over finite sets of input and output signals $\Sigma$ and $\Gamma$. In this synchronous
setting, input and output signals alternate, and thus implementations of such a
specification are defined by means of synchronous transducers, which are Büchi
automata with transitions of the form $(q,\sigma,\gamma,q')$, expressing that in state $q$,
when getting input $\sigma \in \Sigma$, output $\gamma \in \Gamma$ is produced and the machine moves
to state $q'$. We aim at building deterministic implementations, in the sense
that the output $\gamma$ and state $q'$ uniquely depend on $q$ and $\sigma$. The realisability
problem of specifications given as synchronous non-deterministic transducers, by
implementations defined by synchronous deterministic transducers is known to
be decidable [14,20]. In this paper, we are interested in the asynchronous setting,
in which transducers can produce none or several outputs at once every time
some input is read, i.e., transitions are of the form $(q,\sigma,w,q')$ where $w \in \Gamma^*$.
However, such generalisation makes the realisability problem undecidable [2,9].


Synthesis of Transducers with Registers In the setting we just described, the set 
of signals is considered to be finite. This assumption is not realistic in general, 
as signals may come with unbounded information (e.g. process ids) that we call 
here data. To address this limitation, recent works have considered the synthesis 
of reactive systems processing data words [17,6,16,7]. Data words are infinite
words over an alphabet $\Sigma \times D$, where $\Sigma$ is a finite set and $D$ is a possibly infinite
countable set. To handle data words, just as automata have been extended to
register automata, transducers have been extended to register transducers. Such
transducers are equipped with a finite set of registers in which they can store
data and with which they can compare data for equality or inequality. While
the realisability problem of specifications given as synchronous non-deterministic
register transducers ($\mathrm{NRT}_{\mathrm{syn}}$) by implementations defined by synchronous
deterministic register transducers ($\mathrm{DRT}_{\mathrm{syn}}$) is undecidable, decidability is recovered
for specifications defined by universal register transducers and by giving as input
the number of registers the implementation must have [7,17].


Computable Implementations In the previously mentioned works, both for finite or 
infinite alphabets, implementations are considered to be deterministic transducers. 
Such an implementation is guaranteed to use only a constant amount of memory 
(assuming data have size $O(1)$). While it makes sense with regards to
memory-efficiency, some problems turn out to be undecidable, as already mentioned:
realisability of $\mathrm{NRT}_{\mathrm{syn}}$ specifications by $\mathrm{DRT}_{\mathrm{syn}}$, or, in the finite alphabet setting,
when both the specification and implementation are asynchronous. In this paper,
we propose to study computable implementations, in the sense of (partial)
functions $f$ of data ω-words computable by some Turing machine $M$ that has an
infinite input $x \in \mathsf{dom}(f)$, and produces longer and longer prefixes of the output
$f(x)$ as it reads longer and longer prefixes of the input $x$. Therefore, such a machine
produces the output $f(x)$ in the limit. We denote by TM the class of Turing
machines computing functions in this sense. As an example, consider the function
$f$ that takes as input any data ω-word $u = (\sigma_1,d_1)(\sigma_2,d_2)\dots$ and outputs
$(\sigma_1,d_1)^\omega$ if $d_1$ occurs at least twice in $u$, and otherwise outputs $u$. This function is
not computable, as a hypothetical machine could not output anything as long as
$d_1$ is not met a second time. However, the following function $g$ is computable. It
is defined only on words $(\sigma_1,d_1)(\sigma_2,d_2)\dots$ such that $\sigma_1\sigma_2\dots \in ((a+b)c^*)^\omega$, and
transforms any $(\sigma_i,d_i)$ into $(\sigma_i,d_1)$ if the next symbol in $\{a,b\}$ is an $a$, otherwise it
keeps $(\sigma_i,d_i)$ unchanged. To compute it, a TM would need to store $d_1$, and then
wait until the next symbol in $\{a,b\}$ is met before outputting something. Since
the finite input labels are necessarily in $((a+b)c^*)^\omega$, this machine will produce
the whole output in the limit. Note that $g$ cannot be defined by any deterministic
register transducer, as it needs unbounded memory to be implemented.
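
To make the computation of g concrete, here is a minimal Python sketch of a streaming procedure that consumes a finite prefix of the input and emits the part of the output that is already determined. The function name `compute_g` and the encoding of letters as `(label, data)` pairs are assumptions made for illustration, and "the next symbol in {a,b}" is read as the first such symbol strictly after the current position. The buffer `pending` can grow arbitrarily along a block of c's, which reflects the need for unbounded memory.

```python
from typing import Iterable, Iterator, List, Tuple

Letter = Tuple[str, int]  # (label, data value)


def compute_g(stream: Iterable[Letter]) -> Iterator[Letter]:
    """Emit the output of g as soon as it is determined by the input read so far."""
    first_data = None            # d_1, the data value of the first letter
    pending: List[Letter] = []   # letters waiting for the next a/b symbol
    for label, data in stream:
        if first_data is None:
            first_data = data
        if label in ("a", "b"):
            # The pending letters now know their next symbol in {a, b}.
            for p_label, p_data in pending:
                if label == "a":
                    yield (p_label, first_data)
                else:
                    yield (p_label, p_data)
            pending = []
        pending.append((label, data))
```

On the prefix (a,1)(c,2)(b,3)(c,4)(a,5), the first two letters are emitted unchanged once b is read, and the letters (b,3) and (c,4) are emitted with data value 1 once the final a is read; the last letter stays buffered.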

However, already in the finite alphabet setting, the problem of deciding if a 
specification given as some non-deterministic synchronous transducer is realisable 
by some computable function is open. The particular case of realisability by 
computable functions of universal domain (the set of all ω-words) is known to be
decidable [12]. In the asynchronous setting, the undecidability proof of [2] can be 
easily adapted to show the undecidability of realisability of specifications given 
by non-deterministic (asynchronous) transducers by computable functions. 


Functional Specifications As said before, a specification is in general a relation 
from inputs to outputs. If this relation is a function, we call it functional. Due to 
the negative results just mentioned about the synthesis of computable functions 
from non-functional specifications, we instead here focus on the case of functional 
specifications and address the following general question: given the specification
of a function of data ω-words, is this function “implementable”, where we define
“implementable” as “being computable by some Turing machine”? Moreover, if it is
implementable, then we want a procedure to automatically generate an algorithm
that computes it. This raises another important question: how to decide whether
a specification is functional? We investigate these questions for asynchronous
register transducers, here called register transducers. This asynchrony allows for 
much more expressive power, but is a source of technical challenge. 


Contributions In this paper, we solve the questions mentioned before for the 
class of (asynchronous) non-deterministic register transducers (NRT). We also 
give fundamental results on this class. In particular, we prove that: 


1. deciding whether an NRT defines a function is PSPACE-complete,
2. deciding whether two functions defined by NRT are equal on the intersection
   of their domains is PSPACE-complete,
3. the class of functions defined by NRT is effectively closed under composition,
4. computability and continuity are equivalent notions for functions defined by
   NRT, where continuity is defined using the classical Cantor distance,
5. deciding whether a function given as an NRT is computable is PSPACE-c,
6. those problems are in PTIME for a subclass of NRT, called test-free NRT.


Finally, we also mention that considering the class of deterministic register 
transducers (DRT for short) instead of computable functions as a yardstick for 
the notion of being “implementable” for a function would yield undecidability. 
Indeed, given a function defined by some NRT, it is in general undecidable to 
check whether this function is realisable by some DRT, by a simple reduction 
from the universality problem of non-deterministic register automata [19]. 


Related Work The notion of continuity with regards to Cantor distance is not 
new, and for rational functions over finite alphabets, it was already known to be 
decidable [21]. Its connection with computability for functions of ω-words over
a finite alphabet has recently been investigated in [3] for one-way and two-way 
transducers. Our results lift some of theirs to the setting of data words. The 
model of test-free NRT can be seen as a one-way non-deterministic version of a 
model of two-way transducers considered in [5]. 


2 Data Words and Register Transducers 


For a (possibly infinite) set $S$, we denote by $S^*$ (resp. $S^\omega$) the set of finite
(resp. infinite) words over this alphabet, and we let $S^\infty = S^* \cup S^\omega$. For a
word $u = u_1 \dots u_n$, we denote $\|u\| = n$ its length, and, by convention, for
$u \in S^\omega$, $\|u\| = \infty$. The empty word is denoted $\varepsilon$. For $1 \leq i \leq j \leq \|u\|$, we let
$u[i{:}j] = u_i u_{i+1} \dots u_j$ and $u[i] = u[i{:}i]$ the $i$th letter of $u$. For $u,v \in S^\infty$, we say
that $u$ is a prefix of $v$, written $u \preceq v$, if there exists $w \in S^\infty$ such that $v = uw$.
In this case, we define $u^{-1}v = w$. For $u,v \in S^\infty$, we say that $u$ and $v$ mismatch,
written $\mathsf{mismatch}(u,v)$, when there exists a position $i$ such that $1 \leq i \leq \|u\|$,
$1 \leq i \leq \|v\|$ and $u[i] \neq v[i]$. Finally, for $u,v \in S^\infty$, we denote by $u \wedge v$ their
longest common prefix, i.e. the longest word $w \in S^\infty$ such that $w \preceq u$ and $w \preceq v$.
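
On finite words, the longest common prefix and the mismatch predicate defined above can be sketched in Python as follows. The function names are chosen for illustration, and words are modelled as Python sequences (strings, lists or tuples).

```python
from typing import Sequence


def longest_common_prefix(u: Sequence, v: Sequence) -> Sequence:
    """Longest word w that is a prefix of both u and v."""
    n = 0
    for a, b in zip(u, v):
        if a != b:
            break
        n += 1
    return u[:n]


def mismatch(u: Sequence, v: Sequence) -> bool:
    """True when u and v differ at some position that exists in both words."""
    return any(a != b for a, b in zip(u, v))
```

Note that a word and one of its proper prefixes do not mismatch: `mismatch("ab", "abc")` is false, even though the two words are different.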


Data Words In this paper, $\Sigma$ and $\Gamma$ are two finite alphabets and $D$ is a countably
infinite set of data. We use letter $\sigma$ (resp. $\gamma$, $d$) to denote elements of $\Sigma$ (resp.
$\Gamma$, $D$). We also distinguish an arbitrary data value $d_0 \in D$. Given a set $R$, let
$\tau_0^R$ be the constant function defined by $\tau_0^R(r) = d_0$ for all $r \in R$. Given a finite
alphabet $A$, a labelled data is a pair $x = (a,d) \in A \times D$, where $a$ is the label and
$d$ the data. We define the projections $\mathsf{lab}(x) = a$ and $\mathsf{dt}(x) = d$. A data word over
$A$ and $D$ is an infinite sequence of labelled data, i.e. a word $w \in (A \times D)^\omega$. We
extend the projections $\mathsf{lab}$ and $\mathsf{dt}$ to data words naturally, i.e. $\mathsf{lab}(w) \in A^\omega$ and
$\mathsf{dt}(w) \in D^\omega$. A data word language is a subset $L \subseteq (A \times D)^\omega$. Note that here,
data words are infinite, otherwise they are called finite data words.


2.1 Register Transducers 


Register transducers are transducers recognising data word relations. They are 
an extension of finite transducers to data word relations, in the same way register 
automata [15] are an extension of finite automata to data word languages. Here,
we define them over infinite data words with a Büchi acceptance condition, and
allow multiple registers to contain the same data, with a syntax close to [18].
The current data can be compared for equality with the register contents via
tests, which are symbolic and defined via Boolean formulas of the following form.
Given $R$ a set of registers, a test is a formula $\varphi$ satisfying the following syntax:


$\varphi ::= \top \mid \bot \mid r^{=} \mid r^{\neq} \mid \varphi \wedge \varphi \mid \varphi \vee \varphi \mid \neg\varphi$


where $r \in R$. Given a valuation $\tau : R \to D$, a test $\varphi$ and a data $d$, we denote
by $\tau,d \models \varphi$ the satisfiability of $\varphi$ by $d$ in valuation $\tau$, defined as $\tau,d \models r^{=}$ if
$\tau(r) = d$ and $\tau,d \models r^{\neq}$ if $\tau(r) \neq d$. The Boolean combinators behave as usual.
We denote by $\mathsf{Tst}_R$ the set of (symbolic) tests over $R$.
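
The satisfaction relation for tests can be sketched as a small recursive evaluator in Python. The encoding of tests as nested tuples (`("eq", r)`, `("neq", r)`, `("and", t1, t2)`, and so on) and the function name `holds` are assumptions made for this illustration; valuations are dictionaries from register names to data values.

```python
from typing import Dict, Tuple

Valuation = Dict[str, int]


def holds(test: Tuple, tau: Valuation, d: int) -> bool:
    """Decide whether data d satisfies the test in valuation tau."""
    kind = test[0]
    if kind == "true":
        return True
    if kind == "false":
        return False
    if kind == "eq":       # r^= : register r contains d
        return tau[test[1]] == d
    if kind == "neq":      # r^≠ : register r does not contain d
        return tau[test[1]] != d
    if kind == "and":
        return holds(test[1], tau, d) and holds(test[2], tau, d)
    if kind == "or":
        return holds(test[1], tau, d) or holds(test[2], tau, d)
    if kind == "not":
        return not holds(test[1], tau, d)
    raise ValueError(f"unknown test constructor: {kind!r}")
```

For example, with `tau = {"r1": 3, "r2": 5}`, the test `("and", ("neq", "r1"), ("eq", "r2"))` holds for the data value 5 and fails for 3.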


Definition 1. A non-deterministic register transducer (NRT) is a tuple $T =
(Q,R,i_0,F,\Delta)$, where $Q$ is a finite set of states, $i_0 \in Q$ is the initial state,
$F \subseteq Q$ is the set of accepting states, $R$ is a finite set of registers and $\Delta \subseteq
Q \times \Sigma \times \mathsf{Tst}_R \times 2^R \times (\Gamma \times R)^* \times Q$ is a finite set of transitions. We write
$q \xrightarrow{\sigma,\varphi \mid \mathsf{asgn},o}_T q'$ for $(q,\sigma,\varphi,\mathsf{asgn},o,q') \in \Delta$ ($T$ is sometimes omitted).


The semantics of a register transducer is given by a labelled transition system:
we define $L_T = (C, \Lambda, \to)$, where $C = Q \times (R \to D)$ is the set of configurations,
$\Lambda = (\Sigma \times D) \times (\Gamma \times D)^*$ is the set of labels, and we have, for all $(q,\tau), (q',\tau') \in C$
and for all $(l,w) \in \Lambda$, that $(q,\tau) \xrightarrow{(l,w)} (q',\tau')$ whenever there exists a transition
$q \xrightarrow{\sigma,\varphi \mid \mathsf{asgn},o}_T q'$ such that, by writing $l = (\sigma',d)$ and $w = (\gamma'_1,d_1)\dots(\gamma'_n,d_n)$:

— (Matching labels) $\sigma = \sigma'$

— (Compatibility) $d$ satisfies the test $\varphi \in \mathsf{Tst}_R$, i.e. $\tau,d \models \varphi$.

— (Update) $\tau'$ is the successor register configuration of $\tau$ with regards to $d$ and
$\mathsf{asgn}$: $\tau'(r) = d$ if $r \in \mathsf{asgn}$, and $\tau'(r) = \tau(r)$ otherwise

— (Output) By writing $o = (\gamma_1,r_1)\dots(\gamma_m,r_m)$, we have that $m = n$ and for
all $1 \leq i \leq n$, $\gamma_i = \gamma'_i$ and $d_i = \tau'(r_i)$.
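
The four conditions above (matching labels, compatibility, update, output) can be sketched as a single-step function in Python. This is only an illustration under stated assumptions: the function name `step`, the tuple layout of transitions and the representation of a test as a Python callable taking the valuation and the current data are all hypothetical choices, not part of the formal model.

```python
from typing import Callable, Dict, FrozenSet, List, Optional, Tuple

Valuation = Dict[str, int]
Test = Callable[[Valuation, int], bool]
# (source state, input label, test, assigned registers, symbolic output, target state)
Transition = Tuple[str, str, Test, FrozenSet[str], Tuple[Tuple[str, str], ...], str]
Config = Tuple[str, Valuation]


def step(config: Config, letter: Tuple[str, int],
         transition: Transition) -> Optional[Tuple[Config, List[Tuple[str, int]]]]:
    """Fire one transition on one input letter, or return None if it is not enabled."""
    q, tau = config
    label, d = letter
    src, sigma, test, asgn, out, dst = transition
    # Matching labels and compatibility of the data with the test.
    if src != q or sigma != label or not test(tau, d):
        return None
    # Update: assigned registers receive d, the others keep their content.
    new_tau = {r: (d if r in asgn else value) for r, value in tau.items()}
    # Output: register names are replaced by their content after the update.
    produced = [(gamma, new_tau[r]) for gamma, r in out]
    return (dst, new_tau), produced
```

Note that the output is evaluated in the updated valuation, so a transition that stores the current data in a register can output it immediately.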


Then, a run of $T$ is an infinite sequence of configurations and transitions
$\rho = (q_0,\tau_0) \xrightarrow{(u_1,v_1)}_{L_T} (q_1,\tau_1) \xrightarrow{(u_2,v_2)}_{L_T} \cdots$. Its input is $\mathsf{in}(\rho) = u_1u_2\dots$, its output is
$\mathsf{out}(\rho) = v_1 \cdot v_2 \dots$. We also define its sequence of states $\mathsf{st}(\rho) = q_0q_1\dots$, and its
trace $\mathsf{tr}(\rho) = u_1 \cdot v_1 \cdot u_2 \cdot v_2 \dots$. Such run is initial if $(q_0,\tau_0) = (i_0,\tau_0^R)$. It is final if it
satisfies the Büchi condition, i.e. $\inf(\mathsf{st}(\rho)) \cap F \neq \emptyset$, where $\inf(\mathsf{st}(\rho)) = \{q \in Q \mid q = q_i$
for infinitely many $i\}$. Finally, it is accepting if it is both initial and final. We
then write $(q_0,\tau_0) \xrightarrow{u \mid v}_T$ to express that there is a final run $\rho$ of $T$ starting from
$(q_0,\tau_0)$ such that $\mathsf{in}(\rho) = u$ and $\mathsf{out}(\rho) = v$. In the whole paper, and unless stated
otherwise, we always assume that the output of an accepting run is infinite
($v \in (\Gamma \times D)^\omega$), which can be ensured by a Büchi condition.

A partial run is a finite prefix of a run. The notions of input, output and states
are extended by taking the corresponding prefixes. We then write $(q_0,\tau_0) \xrightarrow{u \mid v}_T
(q_n,\tau_n)$ to express that there is a partial run $\rho$ of $T$ starting from configuration
$(q_0,\tau_0)$ and ending in configuration $(q_n,\tau_n)$ such that $\mathsf{in}(\rho) = u$ and $\mathsf{out}(\rho) = v$.
Finally, the relation represented by a transducer $T$ is:


$[\![T]\!] = \{\, (u,v) \in (\Sigma \times D)^\omega \times (\Gamma \times D)^\omega \mid$ there exists an accepting run $\rho$ of $T$
such that $\mathsf{in}(\rho) = u$ and $\mathsf{out}(\rho) = v \,\}$


Example 2. As an example, consider the register transducer Trename depicted in 
Figure 1. It realises the following transformation: consider a setting in which we 
deal with logs of communications between a set of clients. Such a log is an infinite 
sequence of pairs consisting of a tag, chosen in some finite alphabet X, and the 
identifier of the client delivering this tag, chosen in some infinite set of data values. 
The transformation should modify the log as follows: for a given client that needs 
to be modified, each of its messages should now be associated with some new 
identifier. The transformation has to verify that this new identifier is indeed free, 
i.e. never used in the log. Before treating the log, the transformation receives as 
input the id of the client that needs to be modified (associated with the tag del), 
and then a sequence of identifiers (associated with the tag ch), ending with #. 
The transducer is non-deterministic as it has to guess which of these identifiers 
it can choose to replace the one of the client. In particular, observe that it may 
associate multiple output words to a same input if two such free identifiers exist. 




Fig. 1. A register transducer $T_{\mathsf{rename}}$. It has three registers $r_1$, $r_2$ and $r_0$ and four states.
$\sigma$ denotes any letter in $\Sigma$, $r_1$ stores the id of del and $r_2$ the chosen id of ch, while $r_0$
is used to output the last data value read as input. As we only assign data to single
registers, we write $r_i$ for the singleton assignment set $\{r_i\}$.


Finite Transducers Since we reduce the decision of continuity and functionality
of NRT to the one of finite transducers, let us introduce them: a finite transducer
(NFT for short) is an NRT with 0 registers (i.e. $R = \emptyset$). Thus, its transition
relation can be represented as $\Delta \subseteq Q \times \Sigma \times \Gamma^* \times Q$. A direct extension of the
construction of [15, Proposition 1] shows that:


Proposition 3. Let $T$ be an NRT with $k$ registers, and let $X \subset_f D$ be a finite
subset of data. Then, $[\![T]\!] \cap ((\Sigma \times X)^\omega \times (\Gamma \times X)^\omega)$ is recognised by an NFT of
exponential size, more precisely with $O(|Q| \times |X|^{k})$ states.


2.2 Technical Properties of Register Automata 


Although automata are simpler machines than transducers, we only use them as 
tools in our proofs, which is why we define them from transducers, and not the 
other way around. A non-deterministic register automaton, denoted NRA, is a
transducer without outputs: its transition relation is $\Delta \subseteq Q \times \Sigma \times \mathsf{Tst}_R \times 2^R \times
\{\varepsilon\} \times Q$ (simply represented as $\Delta \subseteq Q \times \Sigma \times \mathsf{Tst}_R \times 2^R \times Q$). The semantics are
the same, except that now we lift the condition that the output $v$ is infinite since
there is no output. For $A$ an NRA, we denote $L(A) = \{u \in (\Sigma \times D)^\omega \mid$ there
exists an accepting run $\rho$ of $A$ over $u\}$. Necessarily the output of an accepting
run is $\varepsilon$. In this section, we establish technical properties about NRA.

Proposition 4, the so-called “indistinguishability property”, was shown in the 
seminal paper by Kaminski and Francez [15, Proposition 1]. Their model differs 
in that they do not allow distinct registers to contain the same data, and in the 
corresponding test syntax, but their result easily carries to our setting. It states 
that if an NRA accepts a data word, then such data word can be relabelled with
data from any set containing $d_0$ and with at least $k + 1$ elements. Indeed, at any
point of time, the automaton can only store at most $k$ data in its registers, so
its notion of “freshness” is a local one, and forgotten data can thus be reused as
fresh ones. Moreover, as the automaton only tests data for equality, their actual
value does not matter, except for $d_0$ which is initially contained in the registers.

Such “small-witness” property is fundamental to NRA, and will be paramount
in establishing decidability of functionality (Section 3) and computability
(Section 4). We use it jointly with Lemma 5, which states that the interleaving of the
traces of runs of an NRT can be recognised with an NRA, and Lemma 6, which
expresses that an NRA can check whether interleaved words coincide on some
bounded prefix, and/or mismatch before some given position.


Proposition 4 ([15]). Let $A$ be an NRA with $k$ registers. If $L(A) \neq \emptyset$, then,
for any $X \subseteq D$ of size $|X| \geq k+1$ such that $d_0 \in X$, $L(A) \cap (\Sigma \times X)^\omega \neq \emptyset$.


The runs of a register transducer T can be flattened to their traces, so as to 
be recognised by an NRA. Those traces can then be interleaved, in order to be 
compared. The proofs of the following properties are straightforward. 

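Let $\rho_1 = (q_0,\tau_0) \xrightarrow{(u_1,u'_1)} (q_1,\tau_1) \dots$ and $\rho_2 = (p_0,\mu_0) \xrightarrow{(v_1,v'_1)} (p_1,\mu_1) \dots$ be
two runs of a transducer $T$. Then, we define their interleaving
$\rho_1 \otimes \rho_2 = u_1 \cdot u'_1 \cdot v_1 \cdot v'_1 \cdot u_2 \cdot u'_2 \cdot v_2 \cdot v'_2 \dots$
and $L_\otimes(T) = \{\rho_1 \otimes \rho_2 \mid \rho_1$ and $\rho_2$ are accepting runs of $T\}$.


Lemma 5. If $T$ has $k$ registers, then $L_\otimes(T)$ is recognised by an NRA with $2k$
registers.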
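
As an illustration of the interleaving of two runs, the following Python sketch interleaves finite prefixes of their traces. The function name `interleave` and the encoding of a partial run as a list of (input letter, output word) pairs are assumptions made for this example.

```python
from typing import List, Tuple

Letter = Tuple[str, int]
Step = Tuple[Letter, List[Letter]]  # (input letter u_i, output word u'_i)


def interleave(run1: List[Step], run2: List[Step]) -> List[Letter]:
    """Build u_1 u'_1 v_1 v'_1 u_2 u'_2 v_2 v'_2 ... from two partial runs."""
    word: List[Letter] = []
    for (u, u_out), (v, v_out) in zip(run1, run2):
        word.append(u)
        word.extend(u_out)
        word.append(v)
        word.extend(v_out)
    return word
```

Since outputs may be empty or contain several letters, the interleaved word does not alternate regularly between the two runs, which is one reason why comparing outputs requires care.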


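Lemma 6. Let $i,j \in \mathbb{N} \cup \{\infty\}$. We define $M^i_j = \{u_1u'_1v_1v'_1 \cdots \mid \forall k \geq 1,\ u_k,v_k \in
(\Sigma \times D),\ u'_k,v'_k \in (\Gamma \times D)^*,\ \forall 1 \leq k \leq j,\ u_k = v_k$ and $\|u'_1 \cdot u'_2 \cdots \wedge v'_1 \cdot v'_2 \cdots\| \leq i\}$.
Then, $M^i_j$ is recognisable by an NRA with 2 registers and with 1 register if $i = \infty$.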


3 Functionality, Equivalence and Composition of NRT 


In general, since they are non-deterministic, NRT may not define functions but 
relations, as illustrated by Example 2. In this section, we first show that deciding 
whether a given NRT defines a function is PSPACE-complete, in which case we call
it functional. We show, as a consequence, that testing whether two functional NRT
define two functions which coincide on their common domain is PSPACE-complete.
Finally, we show that functions defined by NRT are closed under composition.
This is an appealing property in transducer theory, as it allows one to define complex
functions by composing simple ones.


Example 7. As explained before, the transducer $T_{\mathsf{rename}}$ described in Example 2
is not functional. To gain functionality, one can reinforce the specification by
considering that one gets at the beginning a list of $k$ possible identifiers, and that
one has to select the first one which is free, for some fixed $k$. This transformation
is realised by the register transducer $T_{\mathsf{rename2}}$ depicted in Figure 2 (for $k = 2$).




Fig. 2. An NRT $T_{\mathsf{rename2}}$, with four registers $r_1$, $r_2$, $r_3$ and $r_0$ (the latter being used, as in
Figure 1, to output the last read data). After reading the # symbol, it guesses whether
the value of register $r_2$ appears in the suffix of the input word. If not, it goes to state
5, and replaces occurrences of $r_1$ by $r_2$. Otherwise, it moves to state 6, waiting for an
occurrence of $r_2$, and replaces occurrences of $r_1$ by $r_3$.


Let us start with the functionality problem in the data-free case. It is
already known that checking whether an NFT over ω-words is functional is
decidable [13,11]. By relying on the pattern logic of [10] designed for transducers of
finite words, it can be shown that it is decidable in NLOGSPACE.


Proposition 8. Deciding whether an NFT is functional is in NLOGSPACE. 


The following theorem shows that a relation between data-words defined by an
NRT with $k$ registers is a function iff its restriction to a set of data with at
most $2k + 3$ data is a function. As a consequence, functionality is decidable as it
reduces to the functionality problem of transducers over a finite alphabet.


Theorem 9. Let $T$ be an NRT with $k$ registers. Then, for all $X \subseteq D$ of size
$|X| \geq 2k + 3$ such that $d_0 \in X$, we have that $T$ is functional if and only if
$[\![T]\!] \cap ((\Sigma \times X)^\omega \times (\Gamma \times X)^\omega)$ is functional.


Proof. The left-to-right direction is trivial. Now, assume $T$ is not functional. Let
$x \in (\Sigma \times D)^\omega$ be such that there exist $y,z \in (\Gamma \times D)^\omega$ such that $y \neq z$ and
$(x,y), (x,z) \in [\![T]\!]$. Let $i = \|y \wedge z\|$. Then, consider the language $L = \{\rho_1 \otimes \rho_2 \mid \rho_1$
and $\rho_2$ are accepting runs of $T$, $\mathsf{in}(\rho_1) = \mathsf{in}(\rho_2)$ and $\|\mathsf{out}(\rho_1) \wedge \mathsf{out}(\rho_2)\| \leq i\}$. Since,
by Lemma 5, $L_\otimes(T)$ is recognised by an NRA with $2k$ registers and, by Lemma 6,
$M^i_\infty$ is recognised by an NRA with 2 registers, we get that $L = L_\otimes(T) \cap M^i_\infty$ is
recognised by an NRA with $2k + 2$ registers.

Now, $L \neq \emptyset$, since, by letting $\rho_1$ and $\rho_2$ be the runs of $T$ both with input $x$ and
with respective outputs $y$ and $z$, we have that $w = \rho_1 \otimes \rho_2 \in L$. Let $X \subseteq D$ such
that $|X| \geq 2k + 3$ and $d_0 \in X$. By Proposition 4, we get that $L \cap (\Sigma \times X)^\omega \neq \emptyset$.
By letting $w' = \rho'_1 \otimes \rho'_2 \in L \cap (\Sigma \times X)^\omega$, and $x' = \mathsf{in}(\rho'_1) = \mathsf{in}(\rho'_2)$, $y' = \mathsf{out}(\rho'_1)$
and $z' = \mathsf{out}(\rho'_2)$, we have that $(x',y'), (x',z') \in [\![T]\!] \cap ((\Sigma \times X)^\omega \times (\Gamma \times X)^\omega)$
and $\|y' \wedge z'\| \leq i$, so, in particular, $y' \neq z'$ (since both are infinite words). Thus,
$[\![T]\!] \cap ((\Sigma \times X)^\omega \times (\Gamma \times X)^\omega)$ is not functional.


As a consequence of Proposition 8 and Theorem 9, we obtain the following
result. The lower bound is obtained by encoding non-emptiness of register
automata, which is PSPACE-complete [4].


Corollary 10. Deciding whether an NRT T is functional is PSPACE-complete. 


Hence, the following problem on the equivalence of NRT is decidable: 


Theorem 11. The problem of deciding, given two functions $f,g$ defined by NRT,
whether for all $x \in \mathsf{dom}(f) \cap \mathsf{dom}(g)$, $f(x) = g(x)$, is PSPACE-complete.


Proof. The formula $\forall x \in \mathsf{dom}(f) \cap \mathsf{dom}(g) \cdot f(x) = g(x)$ is true iff the relation
$f \cup g = \{(x,y) \mid y = f(x) \vee y = g(x)\}$ is a function. The latter can be decided by
testing whether the disjoint union of the transducers defining $f$ and $g$ defines a
function, which is in PSPACE by Corollary 10. To show the hardness, we similarly
reduce the emptiness problem of NRA $A$ over finite words, just as in the proof of
Corollary 10. In particular, the functions $f_1$ and $f_2$ defined in this proof (which
have the same domain) are equal iff $L(A) = \emptyset$.


Note that under the promise that f and g have the same domain, the latter 
theorem implies that it is decidable to check whether the two functions are 
equal. However, checking dom(f) = dom(g) is undecidable, as the language- 
equivalence problem for non-deterministic register automata is undecidable, since, 
in particular, universality is undecidable [19]. 

Closure under composition is a desirable property for transducers, which 
holds in the data-free setting [1]. We show that it also holds for functional NRT. 


Theorem 12. Let $f,g$ be two functions defined by NRT. Then, their composition
$f \circ g$ is (effectively) definable by some NRT.


Proof (Sketch). By $f \circ g$ we mean $f \circ g : x \mapsto f(g(x))$. Assume $f$ and $g$ are
defined by $T_f = (Q_f, R_f, q_0, F_f, \Delta_f)$ and $T_g = (Q_g, R_g, p_0, F_g, \Delta_g)$ respectively.
Wlog we assume that the input and output finite alphabets of $T_f$ and $T_g$ are
all equal to $\Sigma$, and that $R_f$ and $R_g$ are disjoint. We construct $T$ such that
$[\![T]\!] = f \circ g$. The proof is similar to the data-free case where the composition is
shown via a product construction which simulates both transducers in parallel,
executing the second on the output of the first. Assume $T_g$ has some transition
$p \xrightarrow{\sigma,\varphi \mid \{r\},o}_{T_g} q$ where $o \in (\Sigma \times R_g)^*$. Then $T$ has to be able to execute transitions


of $T_f$ while processing $o$, even though $o$ does not contain any concrete data values
(this is the main difference with the data-free setting). However,
if $T$ knows the equality types between $R_f$ and $R_g$, then it is able to trigger the
transitions of $T_f$. For example, assume that $o = (a,r_g)$ and assume that the
content of $r_g$ is equal to the content of $r_f$, $r_f$ being a register of $T_f$, then if $T_f$ has
some transition of the form $p' \xrightarrow{a,r_f^{=} \mid \{r'_f\},o'}_{T_f} q'$ then $T$ can trigger the transition
$(p,q) \xrightarrow{\sigma,\varphi \mid \{r\} \cup \{r'_f := r_g\},o'} (p',q')$ where the operation $r'_f := r_g$ is a syntactic sugar
on top of NRT that intuitively means “put the content of $r_g$ into $r'_f$”.


Remark 13. The proof of Theorem 12 does not use the hypothesis that $f$ and $g$
are functions, and actually shows a stronger result, namely that relations defined 
by NRT are closed under composition. 


4 Computability and Continuity 


We equip the set of (finite or infinite) data words with the usual distance: for 
$u, v \in (\Sigma \times D)^\infty$, $d(u,v) = 0$ if $u = v$ and $d(u,v) = 2^{-\|u \wedge v\|}$ otherwise, 
where $u \wedge v$ denotes the longest common prefix of u and v. A sequence 
of (finite or infinite) data words $(x_n)_{n \in \mathbb{N}}$ converges to some infinite data word x 
if for all $\epsilon > 0$, there exists $N > 0$ such that for all $n > N$, $d(x_n, x) < \epsilon$. 
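The distance above can be tried out on finite data words. The following is a minimal sketch, assuming data words are encoded as Python lists of (label, data) pairs with integer data values; the helper names `common_prefix_len` and `distance` are illustrative and do not come from the paper.

```python
from typing import Sequence, Tuple

# A data letter is a pair (label, data value); data values are integers here
# (an assumption made for illustration, matching the case D = N).
DataLetter = Tuple[str, int]


def common_prefix_len(u: Sequence[DataLetter], v: Sequence[DataLetter]) -> int:
    """Length of the longest common prefix of two finite data words."""
    n = 0
    for a, b in zip(u, v):
        if a != b:
            break
        n += 1
    return n


def distance(u: Sequence[DataLetter], v: Sequence[DataLetter]) -> float:
    """d(u, v) = 0 if u = v, and 2^(-|longest common prefix|) otherwise."""
    if list(u) == list(v):
        return 0.0
    return 2.0 ** (-common_prefix_len(u, v))
```

Two words that agree on a longer prefix are closer: words differing at the first position are at distance 1, and each additional shared position halves the distance.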

In order to reason with computability, we assume in the sequel that the 
infinite set of data values D we are dealing with has an effective representation. 
For instance, this is the case when D = N. 

We now define how a Turing machine can compute a function of data words. 
We consider deterministic Turing machines with three tapes: a read-only one-way 
input tape (containing the infinite input data word), a two-way working tape, 
and a write-only one-way output tape (on which it writes the infinite output data 
word). Consider some input data word $x \in (\Sigma \times D)^\omega$. For any integer $k \in \mathbb{N}$, we 
let $M(x, k)$ denote the output written by M on its output tape after having read 
the first k cells of the input tape. Observe that as the output tape is write-only, 
the sequence of data words $(M(x, k))_{k \ge 0}$ is non-decreasing. 


Definition 14 (Computability). A function $f : (\Sigma \times D)^\omega \to (\Gamma \times D)^\omega$ is 
computable if there exists a deterministic multi-tape machine M such that for all 
$x \in \mathrm{dom}(f)$, the sequence $(M(x,k))_{k \ge 0}$ converges to $f(x)$. 


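Definition 15 (Continuity). A function $f : (\Sigma \times D)^\omega \to (\Gamma \times D)^\omega$ is continuous 
at $x \in \mathrm{dom}(f)$ if (equivalently): 


(a) for all sequences of data words $(x_n)_{n \in \mathbb{N}}$ converging towards x, where for all 
$i \in \mathbb{N}$, $x_i \in \mathrm{dom}(f)$, we have that $(f(x_n))_{n \in \mathbb{N}}$ converges to $f(x)$. 
(b) $\forall i \ge 0, \exists j \ge 0, \forall y \in \mathrm{dom}(f), \|x \wedge y\| \ge j \Rightarrow \|f(x) \wedge f(y)\| \ge i$. 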


Then, f is continuous if and only if it is continuous at each $x \in \mathrm{dom}(f)$. Finally, 
a functional NRT T is continuous when $[\![T]\!]$ is continuous. 

Example 16. We give an example of a non-continuous function f. The finite input 
and output alphabets are unary, and are therefore ignored in the description 
of f. Such function associates with every sequence $s = d_1 d_2 \cdots \in D^\omega$ the word 
$f(s) = d_1^\omega$ if $d_1$ occurs infinitely many times in s, otherwise $f(s) = s$ itself. 

The function f is not continuous. Indeed, by taking $d \neq d'$, the sequence of 
data words $d(d')^n d^\omega$ converges to $d(d')^\omega$, while $f(d(d')^n d^\omega) = d^\omega$ converges to 
$d^\omega \neq f(d(d')^\omega) = d(d')^\omega$. 

Moreover, f is realisable by some NRT which non-deterministically guesses 
whether $d_1$ repeats infinitely many times or not. It needs only one register r in 
which to store $d_1$. In the first case, it checks whether the current data d is equal 
to the content of r infinitely often, and in the second case, it checks that this test 
succeeds finitely many times, using Büchi conditions. 
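The discontinuity in Example 16 can be checked mechanically on ultimately periodic words. The sketch below assumes an encoding of an infinite word as a pair (prefix, cycle) standing for prefix · cycle^ω, with integer data values; this encoding and the helper names are illustrative choices, not part of the paper.

```python
from typing import List, Tuple

# (prefix, cycle) encodes the infinite word prefix . cycle^omega; cycle is non-empty.
Lasso = Tuple[Tuple[int, ...], Tuple[int, ...]]


def unfold(w: Lasso, n: int) -> List[int]:
    """First n data values of the infinite word encoded by w."""
    prefix, cycle = w
    out = list(prefix[:n])
    while len(out) < n:
        out.append(cycle[(len(out) - len(prefix)) % len(cycle)])
    return out


def f(s: Lasso) -> Lasso:
    """Example 16: d1^omega if d1 occurs infinitely often in s, else s itself."""
    prefix, cycle = s
    d1 = (prefix + cycle)[0]
    # In a lasso word, a value occurs infinitely often iff it occurs in the cycle.
    if d1 in cycle:
        return ((), (d1,))
    return s


def common_prefix_len(u: Lasso, v: Lasso, bound: int) -> int:
    """Length of the common prefix of u and v, inspected up to `bound` positions."""
    a, b = unfold(u, bound), unfold(v, bound)
    n = 0
    while n < bound and a[n] == b[n]:
        n += 1
    return n


d, d_prime = 0, 1
x = ((d,), (d_prime,))                      # d (d')^omega
for n in range(1, 6):
    x_n = ((d,) + (d_prime,) * n, (d,))     # d (d')^n d^omega
    # Inputs agree on n + 1 positions, outputs only on the first one.
    print(n, common_prefix_len(x_n, x, 50), common_prefix_len(f(x_n), f(x), 50))
```

The inputs share ever longer prefixes while the outputs never agree beyond the first position, which is exactly the failure of condition (b) of Definition 15 at $x = d(d')^\omega$.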

One can show that the register transducer $T_{\mathrm{rename2}}$ considered in Example 7 
also realises a function which is not continuous, as the value stored in register $r_2$ 
may appear arbitrarily far in the input word. One could modify the specification 
to obtain a continuous function as follows. Instead of considering an infinite log, 
one considers now an infinite sequence of finite logs, separated by $ symbols. The 
register transducer $T_{\mathrm{rename3}}$, depicted in Figure 3, defines such a function. 


Fig. 3. A register transducer $T_{\mathrm{rename3}}$. This transducer is non-deterministic, yet it defines 
a continuous function. 


We now prove the equivalence between continuity and computability for 
functions defined by NRT. One direction, namely the fact that computability 
implies continuity, is easy, almost by definition. For the other direction, we rely 
on the following lemma which states that it is decidable whether a word v can be 
safely output, only knowing a prefix u of the input. In particular, given a function 
f, we let $\hat{f}$ be the function defined over all finite prefixes u of words in dom(f) 
by $\hat{f}(u) = \bigwedge (f(uy) \mid uy \in \mathrm{dom}(f))$, the longest common prefix of all outputs of 
continuations of u by f. Then, we have the following decidability result: 


Lemma 17. The following problem is decidable. Given an NRT T defining a 
function f, two finite data words $u \in (\Sigma \times D)^*$ and $v \in (\Gamma \times D)^*$, decide whether 
$v \preceq \hat{f}(u)$. 


Theorem 18. Let f be a function defined by some NRT T. Then f is continuous 
iff f is computable. 


Proof. $\Leftarrow$ Assuming $f = [\![T]\!]$ is computable by some Turing machine M, we show 
that f is continuous. Indeed, consider some $x \in \mathrm{dom}(f)$, and some $i \ge 0$. As the 
sequence of finite words $(M(x, k))_{k \in \mathbb{N}}$ converges to $f(x)$ and these words have 
non-decreasing lengths, there exists $j \ge 0$ such that $|M(x, j)| \ge i$. Hence, for 
any data word $y \in \mathrm{dom}(f)$ such that $|x \wedge y| \ge j$, the behaviour of M on y is the 
same during the first j steps, as M is deterministic, and thus $|f(x) \wedge f(y)| \ge i$, 
showing that f is continuous at x. 

$\Rightarrow$ Assume that f is continuous. We describe a Turing machine computing f; 
the corresponding algorithm is formalised as Algorithm 1. When reading a finite 
prefix $x[{:}j]$ of its input $x \in \mathrm{dom}(f)$, it computes the set $P_j$ of all configurations 
$(q, \tau)$ reached by T on $x[{:}j]$. This set is updated as j increases. It also keeps 
in memory the finite output word $o_j$ that has been output so 
far. For any j, if $\mathrm{dt}(x[{:}j])$ denotes the data that appear in $x[{:}j]$, the algorithm then 
decides, for each input $(\sigma, d) \in \Sigma \times (\mathrm{dt}(x[{:}j]) \cup \{d_0\})$ whether $(\sigma, d)$ can safely 
be output, i.e., whether all accepting runs on words of the form $x[{:}j]y$, for an 
infinite word y, output at least $o_j.(\sigma, d)$. The latter can be decided, given T, $o_j$ 
and $x[{:}j]$, by Lemma 17. Note that it suffices to look at data in $\mathrm{dt}(x[{:}j]) \cup \{d_0\}$ 
only since, by definition of NRT, any data that is output is necessarily stored in 
some register, and therefore appears in $x[{:}j]$ or is equal to $d_0$. Let us show that 


Algorithm 1: Algorithm describing the machine $M_f$ computing f. 

Data: $x \in \mathrm{dom}(f)$ 
1 $o := \epsilon$; 
2 for j = 0 to $\infty$ do 
3     for $(\sigma, d) \in \Sigma \times (\mathrm{dt}(x[{:}j]) \cup \{d_0\})$ do 
4         if $o.(\sigma, d) \preceq \hat{f}(x[{:}j])$ then // such test is decidable by Lemma 17 
5             $o := o.(\sigma, d)$; 
6             output $(\sigma, d)$; 
7         end 
8     end 
9 end 
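The control flow of Algorithm 1 can be sketched on a finite input prefix as follows. This is only an illustration under stated assumptions: the decision procedure of Lemma 17 is replaced by a caller-supplied oracle `f_hat` (hypothetical, returning the longest common prefix of all outputs on continuations of the prefix read so far), data values are integers, and the loop stops at the end of the given prefix instead of running forever.

```python
from typing import Callable, List, Sequence, Tuple

Letter = Tuple[str, int]  # (label, data value)


def is_prefix(u: Sequence[Letter], v: Sequence[Letter]) -> bool:
    """True iff the finite word u is a prefix of the finite word v."""
    return len(u) <= len(v) and list(v[:len(u)]) == list(u)


def run_algorithm_1(
    x_prefix: Sequence[Letter],
    labels: Sequence[str],
    d0: int,
    f_hat: Callable[[Sequence[Letter]], Sequence[Letter]],
) -> List[Letter]:
    """Output produced after reading every cell of x_prefix.

    f_hat is a stand-in for the test of Lemma 17 (an assumption of this sketch).
    """
    o: List[Letter] = []
    for j in range(len(x_prefix) + 1):
        read = list(x_prefix[:j])
        # Only data already seen in the input, or the initial value d0, can be output.
        data = sorted({d for _, d in read} | {d0})
        for sigma in labels:
            for d in data:
                if is_prefix(o + [(sigma, d)], f_hat(read)):
                    o.append((sigma, d))
    return o


# Usage with the identity function, for which f_hat(u) = u.
x = [("a", 3), ("a", 5), ("a", 3)]
print(run_algorithm_1(x, ["a"], 0, lambda u: list(u)))
```

For the identity function, each new input cell becomes safe to output as soon as it has been read, so the output after j steps is exactly the first j letters of the input.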


$M_f$ actually computes f. Let $x \in \mathrm{dom}(f)$. We have to show that the sequence 
$(M_f(x, j))_j$ converges to $f(x)$. Let $o_j$ be the content of variable o of $M_f$ when 
exiting the inner loop at line 8, when the outer loop (line 2) has been executed 
j times (hence j input symbols have been read). Note that $o_j = M_f(x, j)$. We 
have $o_1 \preceq o_2 \preceq \dots$ and $o_j \preceq \hat{f}(x[{:}j])$ for all $j \ge 0$. Hence, $o_j \preceq f(x)$ for all 
$j \ge 0$. To show that $(o_j)_j$ converges to $f(x)$, it remains to show that $(o_j)_j$ is 
non-stabilising, i.e. $o_{i_1} \prec o_{i_2} \prec \dots$ for some infinite subsequence $i_1 < i_2 < \dots$. 
First, note that f being continuous is equivalent to the sequence $(\hat{f}(x[{:}k]))_k$ 
converging to $f(x)$. Therefore we have that $f(x) \wedge \hat{f}(x[{:}k])$ can be arbitrarily long, 
for sufficiently large k. Let $j \ge 0$ and $(\sigma, d) = f(x)[|o_j| + 1]$. By the latter property 
and the fact that $o_j.(\sigma, d) \preceq f(x)$, necessarily, there exists some $k > j$ such that 
$o_j.(\sigma, d) \preceq \hat{f}(x[{:}k])$. Moreover, by definition of NRT, d is necessarily a data that 
appears in some prefix of x, therefore there exists $k' \ge k$ such that d appears in 
$x[{:}k']$ and $o_j.(\sigma, d) \preceq \hat{f}(x[{:}k]) \preceq \hat{f}(x[{:}k'])$. This entails that $o_j.(\sigma, d) \preceq o_{k'}$. So, we 
have shown that for all j, there exists $k' > j$ such that $o_j \prec o_{k'}$, which 
concludes the proof. 


Now that we have shown that computability is equivalent to continuity for 
functions defined by NRT, we exhibit a pattern which allows us to decide continuity. 
Such pattern generalises the one of [3] to the setting of data words, the difficulty 
lying in showing that our pattern can be restricted to a finite number of data. 


Theorem 19. Let T be a functional NRT with k registers. Then, for all $X \subseteq D$ 
such that $|X| \ge 2k + 3$ and $d_0 \in X$, T is not continuous at some $x \in (\Sigma \times D)^\omega$ 
if and only if T is not continuous at some $z \in (\Sigma \times X)^\omega$. 


Proof. The right-to-left direction is trivial. Now, let T be a functional NRT with 
k registers which is not continuous at some $x \in (\Sigma \times D)^\omega$. Let $f : \mathrm{dom}([\![T]\!]) \to 
(\Gamma \times D)^\omega$ be the function defined by T, as: for all $u \in \mathrm{dom}([\![T]\!])$, $f(u) = v$ where 
$v \in (\Gamma \times D)^\omega$ is the unique data word such that $(u, v) \in [\![T]\!]$. 

Now, let $X \subseteq D$ be such that $|X| \ge 2k + 3$ and $d_0 \in X$. We need to build two 
words u and v labelled over X which coincide on a sufficiently long prefix to allow 
for pumping, hence yielding a converging sequence of input data words whose 
images do not converge, witnessing non-continuity. To that end, we use a similar 
proof technique as for Theorem 9: we show that the language of interleaved runs 
whose inputs coincide on a sufficiently long prefix while their respective outputs 
mismatch before a given position is recognisable by an NRA, allowing us to use 
the indistinguishability property. We also ask that one run presents sufficiently 
many occurrences of a final state $q_f$, so that we can ensure that there exists a 
pair of configurations containing $q_f$ which repeats in both runs. 

On reading such u and v, the automaton behaves as a finite automaton, since 
the number of data is finite ([15, Proposition 1]). By analysing the respective runs, 
we can, using pumping arguments, bound the position on which the mismatch 
appears in u, then show the existence of a synchronised loop over u and v after 
such position, allowing us to build the sought witness for non-continuity. 


Relabel over X. Thus, assume T is not continuous at some point $x \in (\Sigma \times D)^\omega$. 
Let $\rho$ be an accepting run of T over x, and let $q_f \in \mathrm{inf}(\mathrm{st}(\rho)) \cap F$ be an accepting 
state repeating infinitely often in $\rho$. Then, let $i \ge 0$ be such that for all $j \ge 0$, 
there exists $y \in \mathrm{dom}(f)$ such that $\|x \wedge y\| \ge j$ but $\|f(x) \wedge f(y)\| \le i$. Now, define 
$K = |Q| \times (2k + 3)^{2k}$ and let $m = (2i + 3) \times (K + 1)$. Finally, pick j such that 
$\rho[1{:}j]$ contains at least m occurrences of $q_f$. Consider the language: 

$L = \{\rho_1 \otimes \rho_2 \mid \|\mathrm{in}(\rho_1) \wedge \mathrm{in}(\rho_2)\| \ge j,\ \|\mathrm{out}(\rho_1) \wedge \mathrm{out}(\rho_2)\| \le i$ and 
there are at least m occurrences of $q_f$ in $\rho_1[1{:}j]\}$ 

By Lemma 5, $L_\otimes(T)$ is recognised by an NRA with 2k registers. Additionally, by 
Lemma 6, $M^i_j$ is recognised by an NRA with 2 registers. Thus, $L = L_\otimes(T) \cap O^{q_f}_{m,j} \cap 
M^i_j$, where $O^{q_f}_{m,j}$ checks there are at least m occurrences of $q_f$ in $\rho_1[1{:}j]$ (this is 
easily doable from the automaton recognising $L_\otimes(T)$ by adding an m-bounded 
counter), is recognisable by an NRA with 2k + 2 registers. 

Choose $y \in \mathrm{dom}(f)$ such that $\|x \wedge y\| \ge j$ but $\|f(x) \wedge f(y)\| \le i$. By letting 
$\rho_1$ (resp. $\rho_2$) be an accepting run of T over x (resp. y) we have $\rho_1 \otimes \rho_2 \in L$, so 
$L \neq \emptyset$. By Proposition 4, $L \cap ((\Sigma \times X)^\omega \times (\Gamma \times X)^\omega) \neq \emptyset$. Let $w = \rho'_1 \otimes \rho'_2 \in 
L \cap ((\Sigma \times X)^\omega \times (\Gamma \times X)^\omega)$, $u = \mathrm{in}(\rho'_1)$ and $v = \mathrm{in}(\rho'_2)$. Then, $\|u \wedge v\| \ge j$, 
$\|f(u) \wedge f(v)\| \le i$ and there are at least m occurrences of $q_f$ in $\rho'_1[1{:}j]$. 

Now, we depict $\rho'_1$ and $\rho'_2$ in Figure 4, where we decompose u as $u = 
u_1 \dots u_m \cdot s$ and v as $v = u_1 \dots u_m \cdot t$; their corresponding images being respectively 
$u' = u'_1 \dots u'_m \cdot s'$ and $u'' = u''_1 \dots u''_m \cdot t''$. We also let $l = (i + 1)(K + 1)$ and 
$l' = 2(i + 1)(K + 1)$. Since the data of u, v and w belong to X, we know that 
$\tau_i, \mu_i : R \to X$. 


Fig. 4. Runs of f over $u = u_1 \dots u_m \cdot s$ and $v = u_1 \dots u_m \cdot t$. 
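As a sanity check on the constants (a worked identity, assuming, as the decomposition in Figure 4 suggests, that the factors $u_1, \dots, u_m$ are delimited by successive occurrences of $q_f$), the choice of m splits exactly into the three blocks used in the remainder of the proof:

```latex
\begin{align*}
m &= (2i+3)(K+1) \\
  &= \underbrace{(i+1)(K+1)}_{l}
   \;+\; \underbrace{(i+1)(K+1)}_{l' - l}
   \;+\; \underbrace{(K+1)}_{m - l'} ,
\qquad\text{with } l = (i+1)(K+1),\; l' = 2(i+1)(K+1).
\end{align*}
```

Read this way, the first l occurrences are used to guarantee enough output, the next $l' - l$ are where the mismatch is located, and the last $K + 1$ leave room for the pigeonhole argument.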


Repeating configurations. First, let us observe that in a partial run of $\rho'_1$ containing 
more than $|Q| \times |X|^k$ occurrences of $q_f$, there is at least one productive transition, 
i.e. a transition whose output is $o \neq \epsilon$. Otherwise, by the pigeonhole principle, 
there exists a configuration $\mu : R \to X$ such that $(q_f, \mu)$ occurs at least twice 
in the partial run. Since all transitions are improductive, it would mean that, 
by writing w the corresponding part of input, we have $(q_f, \mu) \xrightarrow{w \mid \epsilon} (q_f, \mu)$. This 
partial run is part of $\rho'_1$, so, in particular, $(q_f, \mu)$ is accessible; hence, by taking 
$w_0$ such that $(i_0, \tau_0) \xrightarrow{w_0 \mid w'_0} (q_f, \mu)$, we have that $f(w_0 w^\omega) = w'_0$, which is a 
finite word, contradicting our assumption that all accepting runs produce an 
infinite output. This implies that, for any $n \ge |Q| \times |X|^k$ (in particular for $n = l$), 
$\|u'_1 \dots u'_n\| \ge i + 1$. 


Locate the mismatch. Again, upon reading $u_{l+1} \dots u_{l'}$, there are $(i + 1)(K + 1)$ 
occurrences of $q_f$. There are two cases: 


(a) There are at least i + 1 productive transitions in $\rho'_2$. Then, we obtain that 
$\|u''_1 \dots u''_{l'}\| > i$, so $\mathrm{mismatch}(u'_1 \dots u'_{l'}, u''_1 \dots u''_{l'})$, since we know $\|f(u) \wedge 
f(v)\| \le i$ and they are respectively prefixes of f(u) and f(v), both of length at 
least i + 1. Afterwards, upon reading $u_{l'+1} \dots u_m$, there are $K + 1 > |Q| \times |X|^{2k}$ 
occurrences of $q_f$, so, by the pigeonhole principle, there is a repeating pair: 
there exist indices p and p' such that $l' \le p < p' \le m$ and $(q_f, \mu_p) = (q_f, \mu_{p'})$, 
$(q_p, \tau_p) = (q_{p'}, \tau_{p'})$. Thus, let $z_P = u_1 \dots u_p$, $z_R = u_{p+1} \dots u_{p'}$ and $z_C = 
u_{p'+1} \dots u_m \cdot t$ (P stands for prefix, R for repeat and C for continuation; we 
use capital letters to avoid confusion with indices). By denoting $z'_P = u'_1 \dots u'_p$, 
$z'_R = u'_{p+1} \dots u'_{p'}$, $z''_P = u''_1 \dots u''_p$, $z''_R = u''_{p+1} \dots u''_{p'}$ and $z''_C = u''_{p'+1} \dots u''_m \cdot t''$ 
the corresponding images, $z = z_P \cdot z_R^\omega$ is a point of discontinuity. Indeed, 
define $(z_n)_{n \in \mathbb{N}}$ as, for all $n \in \mathbb{N}$, $z_n = z_P \cdot z_R^n \cdot z_C$. Then, $(z_n)_{n \in \mathbb{N}}$ converges 
towards z, but, since for all $n \in \mathbb{N}$, $f(z_n) = z''_P \cdot {z''_R}^n \cdot z''_C$, we have that 
$f(z_n) \not\to f(z) = z'_P \cdot {z'_R}^\omega$, since $\mathrm{mismatch}(z'_P, z''_P)$. 

(b) Otherwise, by the same reasoning as above, it means there exists a repeating 
pair with only improductive transitions in between: there exist indices p 
and p' such that $l \le p < p' \le l'$, $(q_f, \mu_p) = (q_f, \mu_{p'})$, $(q_p, \tau_p) = (q_{p'}, \tau_{p'})$, 
and $(q_f, \mu_p) \xrightarrow{u_{p+1} \dots u_{p'} \mid \epsilon} (q_f, \mu_{p'})$, $(q_p, \tau_p) \xrightarrow{u_{p+1} \dots u_{p'} \mid \epsilon} (q_{p'}, \tau_{p'})$. Then, by 
taking $z_P = u_1 \dots u_p$, $z_R = u_{p+1} \dots u_{p'}$ and $z_C = u_{p'+1} \dots u_m \cdot t$, we have, 
by letting $z'_P = u'_1 \dots u'_p$, $z'_R = u'_{p+1} \dots u'_{p'}$, $z''_P = u''_1 \dots u''_p$, $z''_R = \epsilon$ and 
$z''_C = u''_{p'+1} \dots u''_m \cdot t''$, that $z = z_P \cdot z_R^\omega$ is a point of discontinuity. Indeed, 
define $(z_n)_{n \in \mathbb{N}}$ as, for all $n \in \mathbb{N}$, $z_n = z_P \cdot z_R^n \cdot z_C$. Then, $(z_n)_{n \in \mathbb{N}}$ indeed 
converges towards z, but, since for all $n \in \mathbb{N}$, $f(z_n) = z''_P \cdot z''_C$, we have 
that $f(z_n) \not\to f(z) = z'_P \cdot {z'_R}^\omega$, since $\mathrm{mismatch}(z'_P, z''_P \cdot z''_C)$ (the mismatch 
necessarily lies in $z'_P$, since $\|z'_P\| \ge i + 1$). 


Corollary 20. Deciding whether an NRT defines a continuous function is 
PSPACE-complete. 


Proof. Let $X \subseteq D$ be a set of size 2k + 3 containing $d_0$. By Theorem 19, T is not 
continuous iff it is not continuous at some $z \in (\Sigma \times X)^\omega$, iff $[\![T]\!] \cap ((\Sigma \times X)^\omega \times 
(\Gamma \times X)^\omega)$ is not continuous. By Proposition 3, such relation is recognisable by a 
finite transducer $T_X$ with $O(|Q| \times |X|^{|R|})$ states, which can be built on-the-fly. 
By [3], the continuity of functions defined by NFT is decidable in NLOGSPACE, 
which yields a PSPACE procedure. 

For the hardness, we reduce again from the emptiness problem of register 
automata, which is PSPACE-complete [4]. Let A be a register automaton over 
some alphabet $\Sigma \times D$. We construct a transducer T which defines a continuous 
function iff $L(A) = \emptyset$ iff the domain of T is empty. Let f be a non-continuous 
function realised by some NRT H (it exists by Example 16). Then, let $\# \notin \Sigma$ be 
a fresh symbol, and define the function g as the function mapping any data word 
of the form $w(\#, d)w'$ to $w(\#, d)f(w')$ if $w \in L(A)$. The function g is realised by 
an NRT which simulates A and copies its inputs on the output to implement the 
identity, until it sees #. If it was in some accepting state of A before seeing #, it 
branches to some initial state of H and proceeds executing H. If there is some 
$w_0 \in L(A)$, then the subfunction $g_{w_0}$ mapping words of the form $w_0(\#, d)w'$ 
to $w_0(\#, d)f(w')$ is not continuous, since f is not. Hence g is not continuous. 
Conversely, if $L(A) = \emptyset$, then $\mathrm{dom}(g) = \emptyset$, so g is continuous. 

In [3], non-continuity is characterised by a specific pattern (Lemma 21, Figure 1), 
i.e. the existence of some particular sequence of transitions. By applying this 
characterisation to the finite transducer recognising $[\![T]\!] \cap ((\Sigma \times X)^\omega \times (\Gamma \times X)^\omega)$, 
as constructed in Proposition 3, we can characterise non-continuity by a similar 
pattern, which will prove useful to decide (non-)continuity of test-free NRT in 
NLOGSPACE (cf. Section 5): 


Corollary 21 ([3]). Let T be an NRT with k registers. Then, for all $X \subseteq D$ 
such that $|X| \ge 2k + 3$ and $d_0 \in X$, T is not continuous at some $x \in (\Sigma \times D)^\omega$ 
if and only if it has the pattern of Figure 5. 


Fig. 5. A pattern characterising non-continuity of functions definable by an NRT: we 
ask that there exist configurations $(q_f, \mu)$ and $(q, \tau)$, where $q_f$ is accepting, as well as 
finite input data words u, v, finite output data words u', v', u'', v'', and an infinite input 
data word w admitting an accepting run from configuration $(q, \tau)$ producing output 
w'', such that $\mathrm{mismatch}(u', u'') \vee (v'' = \epsilon \wedge \mathrm{mismatch}(u', u''w''))$. 


5 Test-free Register Transducers 


In [7], we introduced a restriction which allows one to recover decidability of the 
bounded synthesis problem for specifications expressed as non-deterministic 
register automata. Applied to transducers, such restriction also yields polynomial 
complexities when considering the functionality and computability problems. 
An NRT T is test-free when its transition function does not depend on the 
tests conducted over the input data. Formally, we say that T is test-free if for all 
transitions $q \xrightarrow{\sigma,\varphi \mid \mathrm{asgn},\, o} q'$ we have $\varphi = \top$. Thus, we can omit the tests altogether 
and its transition relation can be represented as $\Delta' \subseteq Q \times \Sigma \times 2^R \times (\Gamma \times R)^* \times Q$. 


Example 22. Consider the function $f : (\Sigma \times D)^\omega \to (\Gamma \times D)^\omega$ associating, to 
$x = (\sigma_1, d_1)(\sigma_2, d_2) \dots$, the value $(\sigma_1, d_1)(\sigma_2, d_1)(\sigma_3, d_1) \dots$ if there are infinitely 
many a in x, and $(\sigma_1, d_2)(\sigma_2, d_2)(\sigma_3, d_2) \dots$ otherwise. 

f can be implemented using a test-free NRT with one register: it initially 
guesses whether there are infinitely many a in x; if it is the case, it stores $d_1$ in 
the single register r, otherwise it waits for the next input to get $d_2$ and stores it 
in r. Then, it outputs the content of r along with each $\sigma_i$. f is not continuous, as 
even outputting the first data requires reading an infinite prefix when $d_1 \neq d_2$. 

Note that when a transducer is test-free, the existence of an accepting run over 
a given input x only depends on its finite labels. Hence, the existence of two 
outputs y and z which mismatch over data can be characterised by a simple 
pattern (Figure 6), which allows us to decide functionality in polynomial time: 


Theorem 23. Deciding whether a test-free NRT is functional is in PTIME. 


Proof. Let T be a test-free NRT such that T is not functional. Then, there exist 
$x \in (\Sigma \times D)^\omega$, $y, z \in (\Gamma \times D)^\omega$ such that $(x, y), (x, z) \in [\![T]\!]$ and $y \neq z$. Then, let 
i be such that $y[i] \neq z[i]$. There are two cases. Either $\mathrm{lab}(y[i]) \neq \mathrm{lab}(z[i])$, which 
means that the finite transducer T' obtained by ignoring the registers of T is not 
functional. By Proposition 8, such property can be decided in NLOGSPACE, so 
let us focus on the second case: $\mathrm{dt}(y[i]) \neq \mathrm{dt}(z[i])$. 


Fig. 6. A situation characterising the existence of a mismatch over data. Since acceptance 
does not depend on data, we can always choose x such that $\mathrm{dt}(x[j]) \neq \mathrm{dt}(x[j'])$. Here, 
we assume that the labels of x, y and z range over a unary alphabet; in particular 
$y[i] = x[j]$ iff $\mathrm{dt}(y[i]) = \mathrm{dt}(x[j])$. Finally, for readability, we did not write that r' should 
not be reassigned between j' and l'. Note that the position of i with regards to j, j', l 
and l' does not matter; nor does the position of l w.r.t. l'. 


We here give a sketch of the proof: observe that an input x admits two outputs 
which mismatch over data if and only if it admits two runs which respectively 
store $x[j]$ and $x[j']$ such that $x[j] \neq x[j']$ and output them later at the same 
output position i; the outputs y and z are then such that $\mathrm{dt}(y[i]) \neq \mathrm{dt}(z[i])$. Since 
T is test-free, the existence of two runs over the same input x only depends on 
its finite labels. Then, the registers containing respectively $x[j]$ and $x[j']$ should 
not be reassigned before being output, and should indeed output their content 
at the same position i (cf. Figure 6). Besides, again because of test-freeness, we 
can always assume that x is such that $x[j] \neq x[j']$. Overall, such pattern can 
be checked by a 2-counter Parikh automaton, whose emptiness is decidable in 
PTIME [8] (under conditions that are satisfied here). 


Now, let us move to the case of continuity. Here again, the fact that test-free 
NRT conduct no test over the input data allows us to focus on the only two registers 
that are responsible for the mismatch, the existence of an accepting run being 
only determined by finite labels. 


Theorem 24. Deciding whether a test-free NRT defines a continuous function 
is in PTIME. 


Proof. Let T be a test-free NRT. First, it can be shown that T is not continuous if 
and only if T has the pattern of Figure 7, where r is coaccessible (since acceptance 
only depends on finite labels, T can be trimmed³ in polynomial time). 


Fig. 7. A pattern characterising non-continuity of functions defined by NRT, where 
we ask that there exist some states $q_f$, q and r, where $q_f$ is accepting, as well as 
finite input data words u, v, z and finite output data words u', v', u'', v'', z'' such that 
$\mathrm{mismatch}(u', u'') \vee (v'' = \epsilon \wedge \mathrm{mismatch}(u', u''z''))$. Register assignments are not depicted, 
as there are no conditions on them. We unrolled the loops to highlight the fact that 
they do not necessarily loop back to the same configuration. 


Now, it remains to show that such simpler pattern can be checked in PTIME. 
We treat each part of the disjunction separately: 

(a) there exist u, u', u'', v, v', v'' s.t. $i_0 \xrightarrow{u \mid u'} q_f \xrightarrow{v \mid v'} q_f$ and $i_0 \xrightarrow{u \mid u''} q \xrightarrow{v \mid v''} q$, 
where $q_f \in F$ and $\mathrm{mismatch}(u', u'')$. Then, as shown in the proof of 
Theorem 23, there exists a mismatch between some u' and u'' produced by 
the same input u if and only if there exist two runs and two registers r and 
r' assigned at two distinct positions, and later on output at the same position. 
Such pattern can similarly be checked by a 2-counter Parikh automaton; the 
only difference is that here, instead of checking that the two end states are 
coaccessible with a common ω-word, we only need to check that $q_f \in F$ and 
that there is a synchronised loop over $q_f$ and q, which are regular properties 
that can be checked by the Parikh automaton with only a polynomial increase. 

(b) there exist u, u', u'', v, v', z, z'' s.t. $i_0 \xrightarrow{u \mid u'} q_f \xrightarrow{v \mid v'} q_f$ and $i_0 \xrightarrow{u \mid u''} q \xrightarrow{v \mid \epsilon} 
q \xrightarrow{z \mid z''} r$, where $q_f \in F$ and $\mathrm{mismatch}(u', u''z'')$. By examining again the 
proof of Theorem 23, it can be shown that to obtain a mismatch, it suffices 
that the input is the same for both runs only up to position max(j, j'). More 
precisely, there is a mismatch between u' and u''z'' if and only if there exist 
two registers r and r' and two positions $j, j' \in \{1, \dots, \|u\|\}$ such that $j \neq j'$, 
r is stored at position j, r' is stored at position j', r and r' are respectively 
output at input positions $l \in \{1, \dots, \|u\|\}$ and $l' \in \{1, \dots, \|uz\|\}$ and they 
are not reassigned in the meantime. Again, such property, along with the 
fact that $q_f \in F$ and the existence of a synchronised loop can be checked by 
a 2-counter Parikh automaton of polynomial size. 


Overall, deciding whether a test-free NRT is continuous is in PTIME. 


3 We say that T is trim when all its states are both accessible and coaccessible. 


Abstract. Downward closures of Petri net reachability sets can be finitely 
represented by their set of maximal elements called the minimal coverability 
set or Clover. Many properties (coverability, boundedness, ...) can 
be decided using Clover, in a time proportional to the size of Clover. So 
it is crucial to design algorithms that compute it efficiently. We present a 
simple modification of the original but incomplete Minimal Coverability 
Tree algorithm (MCT), computing Clover, which makes it complete: it 
memorizes accelerations and fires them as ordinary transitions. Contrary 
to the other alternative algorithms for which no bound on the size of the 
required additional memory is known, we establish that the additional 
space of our algorithm is at most doubly exponential. Furthermore we 
have implemented a prototype MinCov which is already very competitive: 
on benchmarks it uses less space than all the other tools and its 
execution time is close to the one of the fastest tool. 


Keywords: Petri nets · Karp-Miller tree algorithm · Coverability · Minimal 
coverability set · Clover · Minimal coverability tree. 


1 Introduction 


Coverability and coverability set in Petri nets. Petri nets are iconic as 
an infinite-state model used for verifying concurrent systems. Coverability, in 
Petri nets, is the most studied property for several reasons: (1) many properties 
like mutual exclusion, safety, control-state reachability reduce to coverability, (2) 
the coverability problem is EXPSPACE-complete (while reachability is non-elementary), 
and (3) there exist efficient prototypes and numerous case studies. To 
solve the coverability problem, there are backward and forward algorithms. But 
these algorithms do not address relevant problems like the repeated coverability 
problem, the LTL model-checking, the boundedness problem and regularity of 
the traces. 

* The work was carried out in the framework of ReLaX, UMI2000 and also supported 
by ANR-17-CE40-0028 project BRAVAS. 

However these problems are EXPSPACE-complete [4,1] and are also decidable 
using the Karp-Miller tree algorithm (KMT) [11] that computes a finite tree 
labeled by a set of ω-markings $C \subseteq \mathbb{N}_\omega^P$ (where $\mathbb{N}_\omega$ is the set of naturals enlarged 
with an upper bound ω and P is the set of places) such that the reachability set 
and the finite set C have the same downward closure in $\mathbb{N}^P$. Thus a marking m is 
coverable if there exists some $m' \ge m$ with $m' \in C$. Hence, C can be seen as one 
among all the possible finite representations of the infinite downward closure of 
the reachability set. This set C allows, for instance, solving multiple instances 
of coverability in time linear w.r.t. the size of C, avoiding repeated calls to 
a costly algorithm. Informally the KMT algorithm builds a reachability 
tree but, in order to ensure termination, substitutes ω for some finite components 
of a marking of a vertex when some marking of an ancestor is smaller. 
Unfortunately C may contain comparable markings while only the maximal 
elements are important. The set of maximal elements of C can be defined 
independently of the KMT algorithm and was called the minimal coverability set 
(MCS) in [6] and abbreviated as the Clover in the more general framework of 
Well Structured Transition Systems (WSTS) [7]. 
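To make the role of the maximal elements concrete, here is a minimal sketch, assuming ω-markings are encoded as tuples whose components are non-negative integers or `math.inf` (standing for ω). The function names are illustrative and not taken from the paper or from the MinCov tool.

```python
import math
from typing import Iterable, Set, Tuple

OMEGA = math.inf  # encoding of the upper bound omega (an assumption of this sketch)
Marking = Tuple[float, ...]


def leq(m1: Marking, m2: Marking) -> bool:
    """Componentwise order on omega-markings."""
    return all(a <= b for a, b in zip(m1, m2))


def is_coverable(m: Marking, C: Iterable[Marking]) -> bool:
    """m is coverable iff some m' in C satisfies m' >= m (one pass over C)."""
    return any(leq(m, c) for c in C)


def maximal_elements(C: Iterable[Marking]) -> Set[Marking]:
    """Keep only the maximal omega-markings of a finite set C."""
    unique = set(C)
    return {c for c in unique if not any(c != d and leq(c, d) for d in unique)}


C = [(1, 0), (OMEGA, 0), (0, 2)]
clover = maximal_elements(C)
print(clover)                        # (1, 0) is dominated by (omega, 0) and dropped
print(is_coverable((5, 0), clover))  # covered by (omega, 0)
```

Dropping the dominated markings does not change which markings are coverable, which is why only the maximal elements matter.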
The minimal coverability tree algorithm. So in [5,6] the author computes 
the minimal coverability set by modifying the KMT algorithm in such a way 
that at each step of the algorithm, the set of ω-markings labelling vertices is an 
antichain. But this aggressive strategy, implemented by the so-called Minimal 
Coverability Tree algorithm (MCT), contains a subtle bug and it may compute 
a strict under-approximation of Clover as shown in [8, 10]. 
Alternative minimal coverability set algorithms. Since the discovery of 
this bug, three algorithms (with variants) [10, 14, 13] have been designed for 
computing the minimal coverability set without building the full Karp-Miller 
tree. In [10] the authors proposed a minimal coverability set algorithm (called 
CovProc) that is not based on the Karp-Miller tree algorithm but uses a similar 
but restricted introduction of ω's. In [14], Reynier and Servais proposed a 
modification of the MCT, called the Monotone-Pruning algorithm (MP), that 
keeps but "deactivates" vertices labeled with smaller ω-markings while MCT 
would have deleted them. Recently in [15], the authors simplified their original 
proof of correctness. In [16], Valmari and Hansen proposed another algorithm 
(denoted below as VH) for constructing the minimal coverability set without 
deleting vertices. Their algorithm builds a graph and not a tree as usual. In [13], 
Piipponen and Valmari improved this algorithm by designing appropriate data 
structures and heuristics for exploration strategy that may significantly decrease 
the size of the graph. 
Our contributions. 


1. We introduce the concept of abstraction as an ω-transition that mimics the 
effect of an infinite family of firing sequences of markings w.r.t. coverability. 
As a consequence adding abstractions to the net does not modify its 
coverability set. Moreover, the classical Karp-Miller acceleration can be 
formalized as an abstraction whose incidence on places is either ω or null. The 
set of accelerations of a net is upward closed and well-ordered. Hence there 
exists a finite subset of minimal accelerations and we show that the size of 
every minimal acceleration is bounded by a double exponential. 

2. Despite the current opinion that "The flaw is intricate and we do not see 
an easy way to get rid of it....Thus, from our point of view, fixing the bug 
of the MCT algorithm seems to be a difficult task" [10], we have found a 
simple modification of MCT which makes it correct. It mainly consists in 
memorizing discovered accelerations and using them as ordinary transitions. 

3. Contrary to all existing minimal coverability set algorithms that use an 
unknown additional memory that could be non primitive recursive, we show, by 
applying a recent result of Leroux [12], that the additional memory required 
for accelerations is at most doubly exponential. 

4. We have developed a prototype in order to also empirically evaluate the 
efficiency of our algorithm and the benchmarks (either from the literature or 
random ones) have confirmed that our algorithm requires significantly less 
memory than the other algorithms and is close to the fastest tool w.r.t. the 
execution time. 

Organization. Section 2 introduces abstractions and accelerations and studies 
their properties. Section 3 presents our algorithm and establishes its correctness. 
Section 4 describes our tool and discusses the results of the benchmarks. We 
conclude and give some perspectives to this work in Section 5. One can find all 
the missing proofs and an illustration of the behavior of the algorithm in [9]. 


2 Covering abstractions 


2.1 Petri nets: reachability and covering 


Here we define Petri nets differently from the usual way but in an equivalent 
manner, i.e. based on the backward incidence matrix Pre and the incidence 
matrix C. The forward incidence matrix is implicitly defined by C + Pre. Such 
a choice is motivated by the introduction of abstractions in Section 2.2. 


Definition 1. A Petri net (PN) is a tuple $N = (P, T, \mathrm{Pre}, C)$ where: 


- P is a finite set of places; 

- T is a finite set of transitions, with $P \cap T = \emptyset$; 

- $\mathrm{Pre} \in \mathbb{N}^{P \times T}$ is the backward incidence matrix; 

- $C \in \mathbb{Z}^{P \times T}$ is the incidence matrix which fulfills: 
for all $p \in P$ and $t \in T$, $C(p,t) + \mathrm{Pre}(p,t) \ge 0$. 


A marked Petri net $(N, m_0)$ is a Petri net N equipped with an initial marking 
$m_0 \in \mathbb{N}^P$. 


The column vector of matrix Pre (resp. C) indexed by $t \in T$ is denoted
$\mathrm{Pre}(t)$ (resp. $C(t)$). A transition $t \in T$ is fireable from a marking
$m \in \mathbb{N}^P$ if $m \geq \mathrm{Pre}(t)$. When $t$ is fireable from $m$, its firing
leads to the marking $m' := m + C(t)$, denoted by $m \xrightarrow{t} m'$. One extends
fireability and firing to a sequence $\sigma \in T^*$ by recurrence on its length.
The empty sequence $\varepsilon$ is always fireable and leaves the marking unchanged.
Let $\sigma = t\sigma'$ be a sequence with $t \in T$ and $\sigma' \in T^*$. Then $\sigma$
is fireable from $m$ if $m \xrightarrow{t} m'$ and $\sigma'$ is fireable from $m'$. The
firing of $\sigma$ from $m$ leads to the marking $m''$ reached by $\sigma'$ from $m'$.
One also denotes this firing by $m \xrightarrow{\sigma} m''$.
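
The firing rule can be sketched in a few lines of Python. This is only a minimal
illustration, not the authors' tool: the toy net (two places, transitions named
"a" and "b") and the helper names is_fireable, fire and fire_sequence are
assumptions made for the example.

```python
from typing import Dict, Sequence, Tuple

Marking = Tuple[int, ...]

# Hypothetical toy net over places (p0, p1); it is not taken from the paper.
# PRE[t] is the column Pre(t) and C[t] is the column C(t).
PRE: Dict[str, Marking] = {"a": (1, 0), "b": (0, 1)}
C: Dict[str, Tuple[int, ...]] = {"a": (-1, 2), "b": (0, -1)}


def is_fireable(m: Marking, t: str) -> bool:
    """t is fireable from m iff m >= Pre(t) componentwise."""
    return all(tokens >= need for tokens, need in zip(m, PRE[t]))


def fire(m: Marking, t: str) -> Marking:
    """Return m' = m + C(t); raise if t is not fireable from m."""
    if not is_fireable(m, t):
        raise ValueError(f"transition {t!r} is not fireable from {m}")
    return tuple(tokens + delta for tokens, delta in zip(m, C[t]))


def fire_sequence(m: Marking, sigma: Sequence[str]) -> Marking:
    """Fire sigma transition by transition; the empty sequence leaves m unchanged."""
    for t in sigma:
        m = fire(m, t)
    return m


m0 = (1, 0)
print(fire_sequence(m0, ["a", "b", "b"]))  # (0, 0)
print(is_fireable((0, 2), "a"))            # False
```

Representing Pre and C column by column mirrors the notation Pre(t) and C(t)
used in the text.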


Definition 2. Let $(N, m_0)$ be a marked net. The reachability set
$\mathrm{Reach}(N, m_0)$ is defined by:


$\mathrm{Reach}(N, m_0) = \{m \mid \exists \sigma \in T^*,\ m_0 \xrightarrow{\sigma} m\}$


In order to introduce the coverability set of a Petri net, let us recall some
definitions and results related to ordered sets. Let $(X, \leq)$ be an ordered set.
The downward (resp. upward) closure of a subset $E \subseteq X$ is denoted by
${\downarrow}E$ (resp. ${\uparrow}E$) and defined by:


${\downarrow}E = \{x \in X \mid \exists y \in E,\ y \geq x\}$ (resp. ${\uparrow}E = \{x \in X \mid \exists y \in E,\ y \leq x\}$)


A subset $E \subseteq X$ is downward (resp. upward) closed if $E = {\downarrow}E$
(resp. $E = {\uparrow}E$).

An antichain $E$ is a set which fulfills: $\forall x \neq y \in E,\ \neg(x < y \vee y < x)$.
$X$ is said to be FAC (for Finite AntiChains) if all its antichains are finite. A
non empty set $E \subseteq X$ is directed if for all $x, y \in E$ there exists
$z \in E$ such that $x \leq z$ and $y \leq z$. An ideal is a set which is downward
closed and directed. There exists an equivalent characterization of FAC sets
which provides a finite description of any downward closed set: a set is FAC if
and only if every downward closed set admits a finite decomposition in ideals (a
proof of this well-known result can be found in [3]).

$X$ is well founded if all its (strictly) decreasing sequences are finite. $X$ is
well ordered if it is FAC and well founded. There are many equivalent
characterizations of well order. For instance, a set $X$ is well ordered if and
only if for every sequence $(x_n)_{n \in \mathbb{N}}$ in $X$, there exists a non
decreasing infinite subsequence. This characterization allows one to design
algorithms that compute trees whose finiteness is ensured by well order. Let us
recall that $(\mathbb{N}, \leq)$ and $(\mathbb{N}^P, \leq)$ are well ordered sets.

We are now ready to introduce the cover (also called the coverability set) of 
a net and to state some of its properties. 


Definition 3. Let $(N, m_0)$ be a marked Petri net. $\mathrm{Cover}(N, m_0)$, its
coverability set, is defined by:


$\mathrm{Cover}(N, m_0) = {\downarrow}\mathrm{Reach}(N, m_0)$


Since the coverability set is downward closed and $\mathbb{N}^P$ is FAC, it admits a
finite decomposition in ideals. The ideals of $\mathbb{N}^P$ can be defined in an
elegant way as follows. One first extends the sets of naturals and integers:
$\mathbb{N}_\omega = \mathbb{N} \cup \{\omega\}$ and $\mathbb{Z}_\omega = \mathbb{Z} \cup \{\omega\}$.
Then one extends the order relation and the addition to $\mathbb{Z}_\omega$: for all
$n \in \mathbb{Z}$, $\omega > n$ and for all $n \in \mathbb{Z}_\omega$,
$n + \omega = \omega + n = \omega$. $\mathbb{N}_\omega^P$ is also a well ordered set
and its members are called ω-markings. There is a one-to-one mapping between
ideals of $\mathbb{N}^P$ and ω-markings. Let $m \in \mathbb{N}_\omega^P$. Define
$\llbracket m \rrbracket$ by:


$\llbracket m \rrbracket = \{m' \in \mathbb{N}^P \mid m' \leq m\}$

$\llbracket m \rrbracket$ is an ideal of $\mathbb{N}^P$ (and every ideal can be defined
in such a way). Let $\Omega$ be a set of ω-markings; $\llbracket \Omega \rrbracket$
denotes the set $\bigcup_{m \in \Omega} \llbracket m \rrbracket$. Due to the above
properties, there exists a unique finite set with minimal size
$\mathrm{Clover}(N, m_0) \subseteq \mathbb{N}_\omega^P$ such that:


$\mathrm{Cover}(N, m_0) = \llbracket \mathrm{Clover}(N, m_0) \rrbracket$

A more general result can be found in [3] for well structured transition systems.
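
To make ω-markings concrete, here is a small Python sketch, given under stated
assumptions: ω is represented by math.inf, markings are tuples indexed by an
arbitrary fixed ordering of the places, and the names leq, in_ideal and
maximal_elements are ours. Given a finite set of ω-markings, keeping only its
maximal elements yields the smallest set of ω-markings that denotes the same
downward closed set, which is how a minimal basis such as the Clover is
characterized.

```python
import math
from typing import Iterable, List, Tuple

OMEGA = math.inf  # stands for ω: greater than every natural number
OmegaMarking = Tuple[float, ...]


def leq(m1: OmegaMarking, m2: OmegaMarking) -> bool:
    """Componentwise order on ω-markings."""
    return all(x <= y for x, y in zip(m1, m2))


def in_ideal(m: OmegaMarking, omega_marking: OmegaMarking) -> bool:
    """Is the ordinary marking m a member of the ideal denoted by omega_marking?"""
    return leq(m, omega_marking)


def maximal_elements(markings: Iterable[OmegaMarking]) -> List[OmegaMarking]:
    """Drop duplicates and every ω-marking dominated by another one."""
    unique = list(dict.fromkeys(markings))
    return [m for m in unique
            if not any(m != other and leq(m, other) for other in unique)]


# Hypothetical data, chosen only for illustration.
basis = [(1, 0), (1, OMEGA), (0, 3), (2, 0), (1, OMEGA)]
print(maximal_elements(basis))       # [(1, inf), (2, 0)]
print(in_ideal((1, 7), (1, OMEGA)))  # True
```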


Example 1. The marked net of Figure 1 is unbounded. Its Clover is the following
set:
$\{p_i,\ p_{bk} + p_m,\ p_l + p_m + \omega p_{ba},\ p_l + p_{bk} + \omega p_{ba} + \omega p_c\}$


For instance, the marking $p_l + p_{bk} + \alpha p_{ba} + \beta p_c$ is reached, thus covered, by the sequence
tite Pti. 


[Figure 1: drawing of the net; the picture did not survive text extraction.]


Fig. 1. An unbounded Petri net


2.2 Abstraction and acceleration 


In order to introduce abstractions and accelerations, we generalize the transitions
to allow the capability to mark a place with ω tokens.


Definition 4. Let $P$ be a set of places. An ω-transition $a$ is defined by:


— $\mathrm{Pre}(a) \in \mathbb{N}_\omega^P$ its backward incidence;
— $C(a) \in \mathbb{Z}_\omega^P$ its incidence with $\mathrm{Pre}(a) + C(a) \geq 0$.


For the sake of homogeneity, one denotes $\mathrm{Pre}(a)(p)$ (resp. $C(a)(p)$) by
$\mathrm{Pre}(p, a)$ (resp. $C(p, a)$). An ω-transition $a$ is fireable from an
ω-marking $m \in \mathbb{N}_\omega^P$ if $m \geq \mathrm{Pre}(a)$. When $a$ is fireable
from $m$, its firing leads to the ω-marking $m' := m + C(a)$, denoted as previously
$m \xrightarrow{a} m'$. One observes that if $\mathrm{Pre}(p, a) = \omega$ then for all
values of $C(p, a)$, $m'(p) = \omega$. So without loss of generality, one assumes
that for every ω-transition $a$, $\mathrm{Pre}(p, a) = \omega$ implies $C(p, a) = \omega$.

In order to define abstractions, we first define the incidences of a sequence
$\sigma$ of ω-transitions by recurrence on its length. As previously, we denote
$\mathrm{Pre}(p, \sigma) := \mathrm{Pre}(\sigma)(p)$ and $C(p, \sigma) := C(\sigma)(p)$.
The base case corresponds to the definition of an ω-transition. Let
$\sigma = t\sigma'$, with $t$ an ω-transition and $\sigma'$ a sequence of
ω-transitions, then:


— $C(\sigma) = C(t) + C(\sigma')$;
— for all $p \in P$
  • if $C(p, t) = \omega$ then $\mathrm{Pre}(p, \sigma) = \mathrm{Pre}(p, t)$;
  • else $\mathrm{Pre}(p, \sigma) = \max(\mathrm{Pre}(p, t),\ \mathrm{Pre}(p, \sigma') - C(p, t))$.


One checks by recurrence that $\sigma$ is fireable from $m$ if and only if
$m \geq \mathrm{Pre}(\sigma)$ and in this case, $m \xrightarrow{\sigma} m + C(\sigma)$.
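
The recurrence on sequences of ω-transitions can be sketched as follows. This is
an illustrative Python fragment under our own assumptions: ω is encoded as
math.inf, an ω-transition is a pair of tuples, and the names OmegaTransition,
prepend and sequence_incidence as well as the sample transitions are hypothetical.

```python
import math
from typing import NamedTuple, Sequence, Tuple

OMEGA = math.inf  # stands for ω


class OmegaTransition(NamedTuple):
    pre: Tuple[float, ...]  # Pre(a), entries in N ∪ {ω}
    c: Tuple[float, ...]    # C(a), entries in Z ∪ {ω}


def fireable(m, a: OmegaTransition) -> bool:
    return all(x >= need for x, need in zip(m, a.pre))


def fire(m, a: OmegaTransition):
    return tuple(x + delta for x, delta in zip(m, a.c))


def prepend(t: OmegaTransition, rest: OmegaTransition) -> OmegaTransition:
    """Incidences of the sequence t·σ' from those of t and of σ' (given as `rest`)."""
    pre = tuple(
        pt if ct == OMEGA else max(pt, pr - ct)
        for pt, ct, pr in zip(t.pre, t.c, rest.pre)
    )
    c = tuple(ct + cr for ct, cr in zip(t.c, rest.c))
    return OmegaTransition(pre, c)


def sequence_incidence(sigma: Sequence[OmegaTransition]) -> OmegaTransition:
    """Fold a non-empty sequence from the right, following the recurrence."""
    result = sigma[-1]
    for t in reversed(sigma[:-1]):
        result = prepend(t, result)
    return result


# Hypothetical ω-transitions over places (p0, p1).
T1 = OmegaTransition(pre=(1, 0), c=(-1, 1))
T2 = OmegaTransition(pre=(0, 2), c=(0, -1))
ACC = OmegaTransition(pre=(1, 0), c=(0, OMEGA))

SIGMA = sequence_incidence([T1, T2])
print(SIGMA)                              # OmegaTransition(pre=(1, 1), c=(-1, 0))
print(sequence_incidence([ACC, T2]))      # OmegaTransition(pre=(1, 0), c=(0, inf))
```

On this example, firing T1 then T2 from (1, 1) gives the same marking as firing
the summarized ω-transition SIGMA once.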

An abstraction of a net is an ω-transition which concisely expresses the
behaviour of the net w.r.t. covering (see Proposition 1). One will observe that a
transition $t$ of a net is by construction (with $\sigma_n = t$ for all $n$) an
abstraction.


Definition 5. Let $N = (P, T, \mathrm{Pre}, C)$ be a Petri net and $a$ be an ω-transition.
$a$ is an abstraction if for all $n \geq 0$, there exists $\sigma_n \in T^*$ such that
for all $p \in P$ with $\mathrm{Pre}(p, a) \in \mathbb{N}$:


1. $\mathrm{Pre}(p, \sigma_n) \leq \mathrm{Pre}(p, a)$;
2. If $C(p, a) \in \mathbb{Z}$ then $C(p, \sigma_n) \geq C(p, a)$;
3. If $C(p, a) = \omega$ then $C(p, \sigma_n) \geq n$.


The following proposition justifies the interest of abstractions. 


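Proposition 1. Let $(N, m_0)$ be a marked Petri net, $a$ be an abstraction and $m$
be an ω-marking such that $\llbracket m \rrbracket \subseteq \mathrm{Cover}(N, m_0)$ and
$m \xrightarrow{a} m'$. Then $\llbracket m' \rrbracket \subseteq \mathrm{Cover}(N, m_0)$.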


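Proof. Pick some $m^* \in \llbracket m' \rrbracket$. Denote
$n = \max(m^*(p) \mid m'(p) = \omega)$ and
$\ell = \max(\mathrm{Pre}(p, \sigma_n),\ n - C(p, \sigma_n) \mid m(p) = \omega)$.
Let us define $m^\sharp \in \llbracket m \rrbracket$ by:


— If $m(p) < \omega$ then $m^\sharp(p) = m(p)$;
— Else $m^\sharp(p) = \ell$.


Let us check that $\sigma_n$ is fireable from $m^\sharp$. Let $p \in P$,


— If $m(p) < \omega$ then $m^\sharp(p) = m(p) \geq \mathrm{Pre}(p, a) \geq \mathrm{Pre}(p, \sigma_n)$;
— Else $m^\sharp(p) = \ell \geq \mathrm{Pre}(p, \sigma_n)$.


Let us show that $m^\sharp + C(\sigma_n) \geq m^*$. Let $p \in P$,


— If $m(p) < \omega$ and $C(p, a) < \omega$ then
  $m^\sharp(p) + C(p, \sigma_n) \geq m(p) + C(p, a) = m'(p) \geq m^*(p)$;
— If $m(p) < \omega$ and $C(p, a) = \omega$ then
  $m^\sharp(p) + C(p, \sigma_n) \geq C(p, \sigma_n) \geq n \geq m^*(p)$;
— If $m(p) = \omega$ then
  $m^\sharp(p) + C(p, \sigma_n) \geq n - C(p, \sigma_n) + C(p, \sigma_n) = n \geq m^*(p)$.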


An easy way to build new abstractions consists in concatenating them. 




Proposition 2. Let $N = (P, T, \mathrm{Pre}, C)$ be a Petri net and $\sigma$ be a sequence
of abstractions. Then the ω-transition $a$ defined by $\mathrm{Pre}(a) = \mathrm{Pre}(\sigma)$
and $C(a) = C(\sigma)$ is an abstraction.


We now introduce the underlying concept of the Karp and Miller construc- 
tion. 


Definition 6. Let $N = (P, T, \mathrm{Pre}, C)$ be a Petri net. One says that $a$ is an
acceleration if $a$ is an abstraction such that $C(a) \in \{0, \omega\}^P$.


The following proposition provides a way to get an acceleration from an 
arbitrary abstraction. 


Proposition 3. Let $N = (P, T, \mathrm{Pre}, C)$ be a Petri net and $a$ be an abstraction.
Define $a'$ an ω-transition as follows. For all $p \in P$:


— If $C(p, a) < 0$ then $\mathrm{Pre}(p, a') = C(p, a') = \omega$;
— If $C(p, a) = 0$ then $\mathrm{Pre}(p, a') = \mathrm{Pre}(p, a)$ and $C(p, a') = 0$;
— If $C(p, a) > 0$ then $\mathrm{Pre}(p, a') = \mathrm{Pre}(p, a)$ and $C(p, a') = \omega$.


Then $a'$ is an acceleration.
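
The three cases of Proposition 3 translate directly into code. The following
Python sketch is illustrative only: ω is encoded as math.inf, an abstraction is
passed as two tuples (its Pre and C vectors), and the function name accelerate
and the sample vectors are our own.

```python
import math
from typing import Tuple

OMEGA = math.inf  # stands for ω
Vector = Tuple[float, ...]


def accelerate(pre: Vector, c: Vector) -> Tuple[Vector, Vector]:
    """Build (Pre(a'), C(a')) from (Pre(a), C(a)) following the three cases."""
    acc_pre, acc_c = [], []
    for pre_p, c_p in zip(pre, c):
        if c_p < 0:        # place may be emptied: require and produce ω
            acc_pre.append(OMEGA)
            acc_c.append(OMEGA)
        elif c_p == 0:     # place unchanged
            acc_pre.append(pre_p)
            acc_c.append(0)
        else:              # place strictly increases (or is already ω): produce ω
            acc_pre.append(pre_p)
            acc_c.append(OMEGA)
    return tuple(acc_pre), tuple(acc_c)


# Hypothetical abstraction over three places.
print(accelerate((1, 0, 2), (-1, 3, 0)))  # ((inf, 0, 2), (inf, inf, 0))
```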


Let us study more deeply the set of accelerations. First we equip the set of
ω-transitions with a "natural" order w.r.t. covering.


Definition 7. Let $P$ be a set of places and $a$, $a'$ be two ω-transitions.
$a \preceq a'$ if and only if $\mathrm{Pre}(a) \leq \mathrm{Pre}(a') \wedge C(a) \geq C(a')$.


In other words, $a \preceq a'$ if, given any ω-marking $m$, whenever $a'$ is fireable
from $m$ then $a$ is also fireable and its firing leads to a marking greater than or
equal to the one reached by the firing of $a'$.


Proposition 4. Let $N$ be a Petri net. Then the set of abstractions of $N$ is
upward closed. Similarly, the set of accelerations is upward closed in the set of
ω-transitions whose incidence belongs to $\{0, \omega\}^P$.


Proposition 5. The set of accelerations of a Petri net is well ordered. 


Proof. The set of accelerations is a subset of $\mathbb{N}_\omega^P \times \{0, \omega\}^P$
(where $P$ is the set of places) with the order obtained by iterating cartesian
products of sets $(\mathbb{N}_\omega, \leq)$ and $(\{0, \omega\}, \geq)$. These sets are
well ordered and the cartesian product preserves this property. So we are done. □

Since the set of accelerations is well ordered and it is upward closed, it is equal
to the upward closure of the finite set of minimal accelerations. Let us study the
size of a minimal acceleration. Given some Petri net, one denotes $d = |P|$ and
$e = \max_{p,t} \max(\mathrm{Pre}(p, t),\ \mathrm{Pre}(p, t) + C(p, t))$.

We are going to use the following result of Jérôme Leroux (published on
HAL in June 2019) which provides a bound for the lengths of shortest sequences
between two mutually reachable markings $m_1$ and $m_2$.
Theorem 1. (Theorem 2, [12]) Let $N$ be a Petri net, $m_1, m_2$ be markings,
$\sigma_1, \sigma_2$ be sequences of transitions such that
$m_1 \xrightarrow{\sigma_1} m_2 \xrightarrow{\sigma_2} m_1$. Then there exist
$\sigma_1', \sigma_2'$ such that $m_1 \xrightarrow{\sigma_1'} m_2 \xrightarrow{\sigma_2'} m_1$
fulfilling:


$|\sigma_1' \sigma_2'| \leq \|m_1 - m_2\|_\infty (3de)^{(d+1)^{2d+4}}$


One deduces an upper bound on the size of minimal accelerations.
Let $v \in \mathbb{N}_\omega^P$. One denotes $\|v\|_\infty = \max(v(p) \mid v(p) \in \mathbb{N})$.


Proposition 6. Let $N$ be a Petri net and $a$ be a minimal acceleration.
Then $\|\mathrm{Pre}(a)\|_\infty \leq e(3de)^{(d+1)^{2d+4}}$.


Proof. Let us consider the net $N' = (P', T', \mathrm{Pre}', C')$ obtained from $N$ by
deleting the set of places $\{p \mid \mathrm{Pre}(p, a) = \omega\}$ and adding the set of
transitions $T_1 = \{t_p \mid p \in P'\}$ with $\mathrm{Pre}(t_p) = p$ and $C(t_p) = -p$.
Observe that $d' \leq d$ and $e' = e$.

One denotes $P_1 = \{p \mid \mathrm{Pre}(p, a) < \omega = C(p, a)\}$. One introduces $m_1$
the marking obtained by restricting $\mathrm{Pre}(a)$ to $P'$ and
$m_2 = m_1 + \sum_{p \in P_1} p$.

Let $\{\sigma_n\}_{n \in \mathbb{N}}$ be a family of sequences associated with $a$. Let
$n^* = \|\mathrm{Pre}(a)\|_\infty + 1$. Then $\sigma_{n^*}$ is fireable in $N'$ from $m_1$
and its firing leads to a marking that covers $m_2$. By concatenating some
occurrences of transitions of $T_1$, one gets a firing sequence in $N'$:
$m_1 \xrightarrow{\sigma_1} m_2$. Using the same process, one gets a firing sequence
$m_2 \xrightarrow{\sigma_2} m_1$.

Let us apply Theorem 1. There exists a sequence $\sigma_1'$ with
$m_1 \xrightarrow{\sigma_1'} m_2$ and $|\sigma_1'| \leq (3de)^{(d+1)^{2d+4}}$ since
$\|m_1 - m_2\|_\infty = 1$. By deleting the transitions of $T_1$ occurring in
$\sigma_1'$, one gets a sequence $\sigma_1'' \in T^*$ such that
$m_1 \xrightarrow{\sigma_1''} m_2' \geq m_2$ with $|\sigma_1''| \leq (3de)^{(d+1)^{2d+4}}$.

The ω-transition $a'$, defined by $\mathrm{Pre}(p, a') = \mathrm{Pre}(p, \sigma_1'')$ for all
$p \in P'$, $\mathrm{Pre}(p, a') = \omega$ for all $p \in P \setminus P'$ and
$C(a') = C(a)$, is an acceleration whose associated family is
$\{\sigma_1''^{\,n}\}_{n \in \mathbb{N}}$. By definition of $m_1$, $a' \preceq a$. Since $a$
is minimal, $a' = a$. Observing that $|\sigma_1''| \leq (3de)^{(d+1)^{2d+4}}$, one gets
$\|\mathrm{Pre}(a)\|_\infty = \|\mathrm{Pre}(a')\|_\infty \leq e(3de)^{(d+1)^{2d+4}}$. □
Thus given any acceleration, one can easily obtain a smaller acceleration
whose (representation) size is exponential.

Proposition 7. Let $N$ be a Petri net and $a$ be an acceleration.
Then the ω-transition $\mathrm{trunc}(a)$ defined by:

— $C(\mathrm{trunc}(a)) = C(a)$;
— for all $p$ such that $\mathrm{Pre}(p, a) \neq \omega$,
  $\mathrm{Pre}(p, \mathrm{trunc}(a)) = \min(\mathrm{Pre}(p, a),\ e(3de)^{(d+1)^{2d+4}})$;
— for all $p$ such that $\mathrm{Pre}(p, a) = \omega$, $\mathrm{Pre}(p, \mathrm{trunc}(a)) = \omega$

is an acceleration.


Proof. Let $a' \preceq a$ be a minimal acceleration. For all $p$ such that
$\mathrm{Pre}(p, a) \neq \omega$, $\mathrm{Pre}(p, a') \leq e(3de)^{(d+1)^{2d+4}}$. So
$a' \preceq \mathrm{trunc}(a)$. Since the set of accelerations is upward closed, one
gets that $\mathrm{trunc}(a)$ is an acceleration. □
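
Truncation only caps the finite entries of Pre(a). The sketch below is a hedged
illustration: the bound is computed as e(3de)^((d+1)^(2d+4)), which is how we
read the bound of Proposition 6 from the text, ω is encoded as math.inf, and the
names size_bound and trunc (as a Python function) are ours. Python integers are
unbounded, so the doubly exponential value can be computed exactly for tiny d.

```python
import math
from typing import Tuple

OMEGA = math.inf  # stands for ω
Vector = Tuple[float, ...]


def size_bound(d: int, e: int) -> int:
    """e * (3de)^((d+1)^(2d+4)), the bound as read from Proposition 6 (assumption)."""
    return e * (3 * d * e) ** ((d + 1) ** (2 * d + 4))


def trunc(pre: Vector, c: Vector, bound: int) -> Tuple[Vector, Vector]:
    """Cap every finite entry of Pre at `bound`; ω entries and C are untouched."""
    new_pre = tuple(p if p == OMEGA else min(p, bound) for p in pre)
    return new_pre, tuple(c)


# Hypothetical acceleration with an oversized precondition on the first place.
bound = size_bound(d=1, e=1)  # 3 ** 64
pre, c = trunc((10 ** 40, OMEGA), (OMEGA, OMEGA), bound)
print(pre[0] == 3 ** 64, pre[1] == OMEGA)  # True True
```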
3 A coverability tree algorithm


3.1 Specification and illustration 


As discussed in the introduction, to compute the clover of a Petri net, most 
algorithms build coverability trees (or graphs), which are variants of the Karp 
and Miller tree with the aim of reducing the peak memory during the execution. 
The seminal algorithm [6] is characterized by a main difference with the KMT
construction: when finding that the marking associated with the current vertex
strictly covers the marking of another vertex, it deletes the subtree issued from
this vertex, and when the current vertex belonged to the removed subtree it
substitutes it for the root of the deleted subtree. This operation drastically
reduces the peak memory but, as shown in [8], entails incompleteness of the
algorithm.

Like the previous algorithms that ensure completeness with deletions, our
algorithm also needs additional memory. However, unlike the other algorithms,
it memorizes accelerations instead of ω-markings. This approach has two
advantages. First, we are able to exhibit a theoretical upper bound on the
additional memory which is doubly exponential, while the other algorithms do not
have such a bound. Furthermore, accelerations are reused in the construction and
thus may even shorten the execution time and peak space w.r.t. the algorithm
in [6].

Before we delve into a high level description of this algorithm, let us present
some of the variables, functions, and definitions used by the algorithm. Algorithm
1, denoted from now on as MinCov, takes as an input a marked net $(N, m_0)$
and constructs a directed labeled tree $CT = (V, E, \lambda, \delta)$, and a set Acc of
ω-transitions (which by Lemma 2 are accelerations). Each $v \in V$ is labeled by an
ω-marking, $\lambda(v) \in \mathbb{N}_\omega^P$. Since $CT$ is a directed tree, every vertex
$v \in V$ has a predecessor (except the root $r$) denoted by $\mathrm{prd}(v)$ and a set of
descendants denoted by $\mathrm{Des}(v)$. By convention, $\mathrm{prd}(r) = r$. Each edge
$e \in E$ is labeled by a firing sequence $\delta(e) \in T \cdot \mathrm{Acc}^*$, consisting of
an ordinary transition followed by a sequence of accelerations (which by Lemma 1
fulfills $\lambda(\mathrm{prd}(v)) \xrightarrow{\delta(\mathrm{prd}(v), v)} \lambda(v)$). In addition,
again by Lemma 1, $m_0 \xrightarrow{\delta(r, r)} \lambda(r)$. Let
$\gamma = e_1 e_2 \ldots e_k \in E^*$ be a path in the tree; we denote
$\delta(\gamma) := \delta(e_1)\delta(e_2) \ldots \delta(e_k) \in (T \cup \mathrm{Acc})^*$. The
subset $\mathrm{Front} \subseteq V$ is the set of vertices 'to be processed'.

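MinCov may call function Delete(v) that removes from $V$ a leaf $v$ of $CT$ and
function Prune(v) that removes from $V$ all descendants of $v \in V$ except $v$ itself
as illustrated in the following figure:


[Figure: a vertex v with a leaf child u; Delete(u) removes u, and Prune(v)
removes all descendants of v. The drawing did not survive text extraction.]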


First MinCov does some initializations, and sets the tree $CT$ to be a single
vertex $r$ with marking $\lambda(r) = m_0$ and $\mathrm{Front} = \{r\}$. Afterwards the main
loop builds the tree, where each iteration consists in processing some vertex in
Front as follows.

MinCov picks a vertex $u \in \mathrm{Front}$ (line 3). From $\lambda(u)$, MinCov fires a
sequence $\sigma \in \mathrm{Acc}^*$ reaching some $m_u$ that maximizes the number of ω
produced, i.e. $|\{p \in P \mid \lambda(u)(p) \neq \omega \wedge m_u(p) = \omega\}|$. Thus in
$\sigma$, no acceleration occurs twice and its length is bounded by $|P|$. Then
MinCov updates $\lambda(u)$ with $m_u$ (line 5) and the label of the edge incoming to
$u$ by concatenating $\sigma$. Afterwards it performs one of the following actions
according to the marking $\lambda(u)$:


— Cleaning (line 7): If there exists $u' \in V \setminus \mathrm{Front}$ with
  $\lambda(u') \geq \lambda(u)$, the vertex $u$ is redundant and MinCov calls Delete(u).

— Accelerating (lines 8-16): If there exists $u'$, an ancestor of $u$ with
  $\lambda(u') < \lambda(u)$, then an acceleration can be computed. The acceleration $a$
  is deduced from the firing sequence labeling the path from $u'$ to $u$. MinCov
  inserts $a$ into Acc, calls Prune(u') and pushes back $u'$ in Front.

— Exploring (lines 18-25): Otherwise MinCov calls Prune(u') followed by
  Delete(u') for all $u' \in V$ with $\lambda(u') < \lambda(u)$ since they are redundant.
  Afterwards, it removes $u$ from Front and for every transition $t \in T$ fireable
  from $\lambda(u)$, it creates a new child for $u$ in $CT$ and inserts it into Front.


For a detailed example of a run of the algorithm see Example 2 in [9]. 
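
The first step of an iteration (firing accelerations from the marking of the
selected vertex so as to produce as many ω as possible) can be sketched as
follows. This is not the authors' implementation: it is a greedy fixpoint that
we believe is sufficient because an acceleration has incidence in {0, ω}, so
firing one never decreases the marking and never disables another. ω is encoded
as math.inf; the names Acceleration and saturate and the sample data are
hypothetical.

```python
import math
from typing import List, NamedTuple, Sequence, Tuple

OMEGA = math.inf  # stands for ω


class Acceleration(NamedTuple):
    pre: Tuple[float, ...]  # entries in N ∪ {ω}
    c: Tuple[float, ...]    # entries in {0, ω}


def saturate(marking: Sequence[float],
             accelerations: Sequence[Acceleration]):
    """Fire fireable accelerations until no new ω can be produced.

    Returns the saturated ω-marking and the list of accelerations fired;
    each acceleration is fired at most once.
    """
    m = tuple(marking)
    fired: List[Acceleration] = []
    progress = True
    while progress:
        progress = False
        for a in accelerations:
            if all(x >= need for x, need in zip(m, a.pre)):
                successor = tuple(x + delta for x, delta in zip(m, a.c))
                if successor != m:  # the firing produced at least one new ω
                    m = successor
                    fired.append(a)
                    progress = True
    return m, fired


# Hypothetical accelerations over three places.
A1 = Acceleration(pre=(1, 0, 0), c=(0, OMEGA, 0))
A2 = Acceleration(pre=(0, OMEGA, 0), c=(0, OMEGA, OMEGA))

saturated, sequence = saturate((1, 0, 0), [A2, A1])
print(saturated)          # (1, inf, inf)
print(sequence == [A1, A2])  # True
```

A2 only becomes fireable once A1 has put ω in the second place, which is why
the loop is iterated until a full pass makes no progress.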


3.2 Correctness Proof 


We now establish the correctness of Algorithm 1 by proving the following
properties (where for all $W \subseteq V$, $\lambda(W)$ denotes
$\bigcup_{v \in W} \{\lambda(v)\}$):


— its termination;

— the incomparability of ω-markings associated with vertices in $V$:
  $\lambda(V)$ is an antichain;

— its consistency: $\llbracket \lambda(V) \rrbracket \subseteq \mathrm{Cover}(N, m_0)$;

— its completeness: $\mathrm{Cover}(N, m_0) \subseteq \llbracket \lambda(V) \rrbracket$.


We get termination by using the well order of $\mathbb{N}_\omega^P$ and König's Lemma.

Proposition 8. MinCov terminates.


Proof. Consider the following variation of the algorithm.


Instead of deleting the current vertex when its marking is smaller than or equal
to the marking of a vertex, one marks it as 'cut' and extracts it from Front.


Instead of cutting a subtree when the marking of the current vertex $v$ is greater
than the marking of a vertex which is not an ancestor of $v$, one marks the
vertices of this subtree as 'cut' and extracts from Front those that are inside.


Instead of cutting a subtree when the marking of the current vertex $v$ is greater
than the marking of a vertex which is an ancestor of $v$, say $v^*$, one marks those
on the path from $v^*$ to $v$ (except $v$) as 'accelerated', one marks the other
vertices of the subtree as 'cut' and inserts $v$ again in Front with the marking of
$v^*$. All the vertices of the subtree that are in Front are extracted from it.


All the vertices marked as 'cut' or 'accelerated' are ignored for comparisons and
discovering accelerations. This alternative algorithm behaves as the original one
except that the size of the tree never decreases and so if the algorithm does
not terminate the tree is infinite. Since this tree is finitely branching, due to
König's Lemma it contains an infinite path. On this infinite path, no vertex can
be marked as 'cut' since it would belong to a finite subtree. Observe that the
marking labelling the vertex following an accelerated subpath has at least one
more ω than the marking of the first vertex of this subpath. So there is an
infinite subpath with unmarked vertices in $V$. But $\mathbb{N}_\omega^P$ is well ordered,
so there should be two vertices $v$ and $v'$, where $v'$ is a descendant of $v$ with
$\lambda(v') \geq \lambda(v)$, which contradicts the behaviour of the algorithm. □


Algorithm 1: Computing the minimal coverability set
MinCov(N, m0)
Input: A marked Petri net (N, m0)
Data: V set of vertices; E ⊆ V × V; Front ⊆ V; λ: V → ℕ_ω^P; δ: E → T·Acc*;
      CT = (V, E, λ, δ) a labeled tree; Acc a set of ω-transitions;
Output: A labeled tree CT = (V, E, λ, δ)
 1  V ← {r}; E ← ∅; Front ← {r}; λ(r) ← m0; Acc ← ∅; δ((r,r)) ← ε
 2  while Front ≠ ∅ do
 3      Select u ∈ Front
 4      Let σ ∈ Acc* a maximal fireable sequence of accelerations from λ(u)
        // Maximal w.r.t. the number of ω's produced
 5      λ(u) ← λ(u) + C(σ)
 6      δ((prd(u),u)) ← δ((prd(u),u)) · σ
 7      if ∃u' ∈ V \ Front s.t. λ(u') ≥ λ(u) then Delete(u)   // λ(u) is covered
 8      else if ∃u' ∈ Anc(u) s.t. λ(u) > λ(u') then
        // An acceleration was found between u and one of u's ancestors
 9          Let γ ∈ E* the path from u' to u in CT
10          a ← NewAcceleration()
11          foreach p ∈ P do
12              if C(p,δ(γ)) < 0 then Pre(p,a) ← ω; C(p,a) ← ω
13              if C(p,δ(γ)) = 0 then Pre(p,a) ← Pre(p,δ(γ)); C(p,a) ← 0
14              if C(p,δ(γ)) > 0 then Pre(p,a) ← Pre(p,δ(γ)); C(p,a) ← ω
15          end
16          a ← trunc(a); Acc ← Acc ∪ {a}; Prune(u'); Front ← Front ∪ {u'}
17      else
18          for u' ∈ V do
            // Remove vertices labeled by markings covered by λ(u)
19              if λ(u') < λ(u) then Prune(u'); Delete(u')
20          end
21          Front ← Front \ {u}
22          foreach t ∈ T ∧ λ(u) ≥ Pre(t) do
            // Add the children of u
23              u' ← NewNode(); V ← V ∪ {u'}; Front ← Front ∪ {u'}; E ← E ∪ {(u,u')}
24              λ(u') ← λ(u) + C(t); δ((u,u')) ← t
25          end
26      end
27  end
28  return CT

Since we are going to use recurrence on the number of iterations of the main
loop of Algorithm 1, we introduce the following notations:
$CT_n = (V_n, E_n, \lambda_n, \delta_n)$, $\mathrm{Front}_n$ and $\mathrm{Acc}_n$ are the values of
variables $CT$, Front, and Acc at line 2 when $n$ iterations have been executed.


Proposition 9. For all $n \in \mathbb{N}$, $\lambda_n(V_n \setminus \mathrm{Front}_n)$ is an
antichain. Thus on termination, $\lambda(V)$ is an antichain.


Proof. Let us introduce $V' := V \setminus \mathrm{Front}$ and
$V'_n := V_n \setminus \mathrm{Front}_n$. We are going to prove by induction on the number
$n$ of iterations of the while-loop that $\lambda_n(V'_n)$ is an antichain. MinCov
initializes variables $V$ and Front at line 1. So $V_0 = \{r\}$ and
$\mathrm{Front}_0 = \{r\}$, therefore $V'_0 = V_0 \setminus \mathrm{Front}_0 = \emptyset$ and
$\lambda_0(V'_0)$ is an antichain.

Assume that $\lambda_n(V'_n)$, with $V'_n = V_n \setminus \mathrm{Front}_n$, is an antichain.
Modifying $V'_n$ can be done by adding or removing vertices from $V_n$ and removing
vertices from $\mathrm{Front}_n$ while keeping them in $V_n$. The actions that MinCov may
perform in order to modify the sets $V$ and Front are: Delete (lines 7 and 19),
Prune (lines 16 and 19), adding vertices to $V$ (line 23), adding vertices to Front
(lines 16 and 23), and removing vertices from Front (line 21).

• Both Delete and Prune do not add new vertices to $V'$. Thus the antichain
feature is preserved.

• MinCov may add vertices to $V$ only at line 23 where it simultaneously adds
them to Front and therefore does not add new vertices to $V'$. Thus the antichain
feature is preserved.

• Adding vertices to Front may only remove vertices from $V'_n$. Thus the antichain
feature is preserved.

• MinCov can only add a vertex to $V'$ when it removes it from Front while keeping
it in $V$. This is done only at line 21. There the only vertex MinCov may remove
(line 21) is the working vertex $u$. However if (in the iteration) MinCov reaches
line 21 then it did not reach line 7, hence (1) all markings of
$\lambda_n(V'_n) \subseteq \lambda_n(V_n)$ are either smaller than or incomparable to
$\lambda_{n+1}(u)$. Moreover, MinCov has also reached lines 18-20, where (2) it
performs Delete on all vertices $u' \in V'_n \subseteq V_n$ with
$\lambda_n(u') < \lambda_{n+1}(u)$. Let us denote by $V''_n \subseteq V'_n$ the set $V'$
at the end of line 20. Due to (1) and (2), marking $\lambda_{n+1}(u)$ is incomparable
to any marking in $\lambda_{n+1}(V''_n)$. Since $V''_n \subseteq V'_n$,
$\lambda_{n+1}(V''_n)$ is an antichain. Combining this fact with the incomparability
between $\lambda_{n+1}(u)$ and any marking in $\lambda_{n+1}(V''_n)$, we conclude that
the set $\lambda_{n+1}(V'_{n+1}) = \lambda_{n+1}(V''_n) \cup \{\lambda_{n+1}(u)\}$ is an
antichain. □

In order to establish consistency, we prove that the labelling of vertices and
edges is compatible with the firing rule and that Acc is a set of accelerations.


Lemma 1. For all $n \in \mathbb{N}$, for all $u \in V_n \setminus \{r\}$,
$\lambda_n(\mathrm{prd}(u)) \xrightarrow{\delta_n(\mathrm{prd}(u), u)} \lambda_n(u)$ and
$m_0 \xrightarrow{\delta_n(r, r)} \lambda_n(r)$.


Proof. Let us prove by induction on the number $n$ of iterations of the main loop
that for all $v \in V_n$, the assertions of the lemma hold. Initially, $V_0 = \{r\}$
and $\lambda_0(r) = m_0$. Since $m_0 \xrightarrow{\varepsilon} m_0 = \lambda_0(r)$ the base
case is established.
Assume that the assertions hold for $CT_n$. Observe that MinCov may change the
labeling function $\lambda$ and/or add new vertices in exactly two places: at lines
4-6 and at lines 22-25. Therefore in order to prove the assertion, we show that
after each group of lines it still holds.
• After lines 4-6: MinCov computes (1) a maximal fireable sequence
$\sigma \in \mathrm{Acc}_n^*$ from $\lambda_n(u)$ (line 4), and updates $u$'s marking to
$m_u = \lambda_n(u) + C(\sigma)$ (line 5). Since the assertions hold for $CT_n$, (2) if
$u \neq r$, $\lambda_n(\mathrm{prd}(u)) \xrightarrow{\delta(\mathrm{prd}(u), u)} \lambda_n(u)$, else
$m_0 \xrightarrow{\delta(r, r)} \lambda_n(r)$. By concatenation, we get
$\lambda_n(\mathrm{prd}(u)) \xrightarrow{\delta(\mathrm{prd}(u), u)\sigma} m_u$ if $u \neq r$ and
otherwise $m_0 \xrightarrow{\delta(r, r)\sigma} m_u$, which establishes that the
assertions hold after line 6.
• After lines 22-25: The vertices for which $\lambda$ is updated at these lines are the
children of $u$ that are added to the tree. For every transition $t \in T$ fireable
from $\lambda(u)$, MinCov creates a child $v_t$ for $u$ (lines 22-23). The marking of
any child $v_t$ is set to $\lambda_{n+1}(v_t) := \lambda_{n+1}(u) + C(t)$ (line 24).
Therefore since $\lambda_{n+1}(u) \xrightarrow{t} \lambda_{n+1}(v_t)$, the assertions hold. □


Lemma 2. At any execution point of MinCov, Acc is a set of accelerations. 


Proof. At most one acceleration is added per iteration. Let us prove by induction
on the number $n$ of iterations of the main loop that $\mathrm{Acc}_n$ is a set of
accelerations. Since $\mathrm{Acc}_0 = \emptyset$, the base case is straightforward.

Assume that $\mathrm{Acc}_n$ is a set of accelerations and consider $\mathrm{Acc}_{n+1}$. In
an iteration, MinCov may add an ω-transition $a$ to Acc. Due to the inductive
hypothesis, $\delta(\gamma)$ is a sequence of abstractions where $\gamma$ is defined at
line 9. Consider $b$, the ω-transition defined by $\mathrm{Pre}(b) = \mathrm{Pre}(\delta(\gamma))$
and $C(b) = C(\delta(\gamma))$. Due to Proposition 2, $b$ is an abstraction. Due to
Proposition 3, the loop of lines 11-15 transforms $b$ into an acceleration $a$. Due
to Proposition 7, after truncation at line 16, $a$ is still an acceleration. □


Proposition 10. $\llbracket \lambda(V) \rrbracket \subseteq \mathrm{Cover}(N, m_0)$.

Proof. Let $v \in V$. Consider the path $u_0, \ldots, u_k$ of $CT$ from the root
$r = u_0$ to $u_k = v$. Let $\sigma \in (T \cup \mathrm{Acc})^*$ denote
$\delta(\mathrm{prd}(u_0), u_0) \cdots \delta(\mathrm{prd}(u_k), u_k)$. Due to Lemma 1,
$m_0 \xrightarrow{\sigma} \lambda(v)$. Due to Lemma 2, $\sigma$ is a sequence of
abstractions. Due to Proposition 2, the ω-transition $a$ defined by
$\mathrm{Pre}(a) = \mathrm{Pre}(\sigma)$ and $C(a) = C(\sigma)$ is an abstraction. Due to
Proposition 1, $\llbracket \lambda(v) \rrbracket \subseteq \mathrm{Cover}(N, m_0)$. □

The following definitions are related to an arbitrary execution point of MinCov
and are introduced to establish its completeness.


Definition 8. Let $\sigma = \sigma_0 t_1 \sigma_1 \ldots t_k \sigma_k$ with for all $i$,
$t_i \in T$ and $\sigma_i \in \mathrm{Acc}^*$. Then the firing sequence
$m \xrightarrow{\sigma} m'$ is an exploring sequence if:


— There exists $v \in \mathrm{Front}$ with $\lambda(v) = m$;
— For all $0 \leq i \leq k$, there does not exist $v' \in V \setminus \mathrm{Front}$
  with $m + C(\sigma_0 t_1 \sigma_1 \ldots t_i \sigma_i) \leq \lambda(v')$.


Definition 9. Let $M$ be a marking. Then $M$ is quasi-covered if:


— either there exists $v \in V \setminus \mathrm{Front}$ with $\lambda(v) \geq M$;
— or there exists an exploring sequence $m \xrightarrow{\sigma} m' \geq M$.


In order to prove completeness of the algorithm, we want to prove that at
the beginning of every iteration, any $m \in \mathrm{Cover}(N, m_0)$ is quasi-covered. To
establish this assertion, we introduce several lemmas showing that this assertion
is preserved by some actions of the algorithm with some prerequisites. More
precisely, Lemma 3 corresponds to the deletion of the current vertex, Lemma 4 to
the discovery of an acceleration, Lemma 5 to the deletion of a subtree whose
root has a marking smaller than the marking of the current vertex, and Lemma 6
to the creation of the children of the current vertex.


Lemma 3. Let $CT$, Front and Acc be the values of corresponding variables at
some execution point of MinCov and $u \in V$ be a leaf in $CT$ such that the
following items hold:


1. All $m \in \mathrm{Cover}(N, m_0)$ are quasi-covered;
2. $\lambda(V \setminus \mathrm{Front})$ is an antichain;
3. For all $a \in \mathrm{Acc}$ fireable from $\lambda(u)$, $\lambda(u) = \lambda(u) + C(a)$;
4. There exists $v \in V \setminus \{u\}$ such that $\lambda(v) \geq \lambda(u)$.


Then all $m \in \mathrm{Cover}(N, m_0)$ are quasi-covered after performing Delete(u).


Lemma 4. Let $CT$, Front and Acc be the values of corresponding variables at
some execution point of MinCov and $u \in V$ such that the following items hold:


1. All $m \in \mathrm{Cover}(N, m_0)$ are quasi-covered;
2. $\lambda(V \setminus \mathrm{Front})$ is an antichain;
3. For all $v \in V \setminus \{r\}$, $\lambda(\mathrm{prd}(v)) \xrightarrow{\delta(\mathrm{prd}(v), v)} \lambda(v)$.


Then all $m \in \mathrm{Cover}(N, m_0)$ are quasi-covered after performing Prune(u) and
then adding $u$ to Front.
Lemma 5. Let $CT$, Front and Acc be the values of corresponding variables at
some execution point of MinCov, $u \in \mathrm{Front}$ and $u' \in V$ such that the
following items hold:


1. All $m \in \mathrm{Cover}(N, m_0)$ are quasi-covered;
2. $\lambda(V \setminus \mathrm{Front})$ is an antichain;
3. For all $v \in V \setminus \{r\}$, $\lambda(\mathrm{prd}(v)) \xrightarrow{\delta(\mathrm{prd}(v), v)} \lambda(v)$;
4. $\lambda(u') < \lambda(u)$ and $u$ is not a descendant of $u'$.


Then after performing Prune(u'); Delete(u'),


1. All $m \in \mathrm{Cover}(N, m_0)$ are quasi-covered;
2. $\lambda(V \setminus \mathrm{Front})$ is an antichain;
3. For all $v \in V \setminus \{r\}$, $\lambda(\mathrm{prd}(v)) \xrightarrow{\delta(\mathrm{prd}(v), v)} \lambda(v)$.


Lemma 6. Let $CT$, Front and Acc be the values of corresponding variables at
some execution point of MinCov and $u \in \mathrm{Front}$ such that the following items
hold:


1. All $m \in \mathrm{Cover}(N, m_0)$ are quasi-covered;
2. $\lambda(V \setminus \mathrm{Front}) \cup \{\lambda(u)\}$ is an antichain;
3. For all $a \in \mathrm{Acc}$ fireable from $\lambda(u)$, $\lambda(u) = \lambda(u) + C(a)$.


Then after removing $u$ from Front and, for all $t \in T$ fireable from $\lambda(u)$,
adding a child $v_t$ to $u$ in Front with marking of $v_t$ defined by
$\lambda(v_t) = \lambda(u) + C(t)$, all $m \in \mathrm{Cover}(N, m_0)$ are quasi-covered.


Proposition 11. At the beginning of every iteration, all $m \in \mathrm{Cover}(N, m_0)$
are quasi-covered.


Proof. Let us prove by induction on the number of iterations that all
$m \in \mathrm{Cover}(N, m_0)$ are quasi-covered.

Let us consider the base case. MinCov initializes $V$ and Front to $\{r\}$ and
$\lambda(r)$ to $m_0$. By definition, for all $m \in \mathrm{Cover}(N, m_0)$ there exists
$\sigma = t_1 t_2 \cdots t_k \in T^*$ such that $m_0 \xrightarrow{\sigma} m' \geq m$. Since
$V \setminus \mathrm{Front} = \emptyset$, this firing sequence is an exploring sequence.

Assume that all $m \in \mathrm{Cover}(N, m_0)$ are quasi-covered at the beginning of some
iteration. Let us examine what may happen during the iteration. In lines 4-6,
MinCov computes the maximal fireable sequence $\sigma \in \mathrm{Acc}_n^*$ from
$\lambda_n(u)$ (line 4) and sets $u$'s marking to $m_u := \lambda_n(u) + C(\sigma)$
(line 5). Afterwards, there are three possible cases: (1) either $m_u$ is covered by
some marking associated with a vertex out of Front, (2) or an acceleration is
found, (3) or MinCov computes the successors of $u$ and removes $u$ from Front.


Line 7. MinCov calls Delete(u). So $CT_{n+1}$ is obtained by deleting $u$.
    Moreover, $\lambda(u') \geq m_u$. Let us check the hypotheses of Lemma 3.
    Assertion 1 follows from induction since (1) the only change in the data is
    the increasing of $\lambda(u)$ by firing some accelerations and (2) $u$ belongs to
    Front so cannot cover intermediate markings of exploring sequences.
    Assertion 2 follows from Proposition 9 since $V \setminus \mathrm{Front}$ is unchanged.
    Assertion 3 follows immediately from lines 4-6. Assertion 4 follows with
    $v = u'$. Thus using this lemma the induction is proved in this case.

Lines 8-16. Let us check the hypotheses of Lemma 4. Assertions 1 and 2 are
    established as in the previous case. Assertion 3 holds due to Lemma 1, and
    the fact that no edge has been added since the beginning of the iteration.
    Thus using this lemma the induction is proved in this case.

Lines 18-25. We first show that the hypotheses of Lemma 6 hold before line 21.
    Let us denote the values of $CT$ and Front after line 20 by $CT'_n$ and
    $\mathrm{Front}'_n$. Observe that for every iteration of line 19 in the inner loop,
    the hypotheses of Lemma 5 are satisfied. Therefore, in order to apply
    Lemma 6 it remains only to check assertions 2 and 3 of this lemma.
    Assertion 2 holds since (1) $\lambda(V \setminus \mathrm{Front})$ is an antichain, (2) due
    to line 7 there is no $w \in V \setminus \mathrm{Front}$ such that
    $\lambda(w) \geq \lambda(u)$, and (3) by iteration of line 19 all
    $w \in V \setminus \mathrm{Front}$ such that $\lambda(w) < \lambda(u)$ have been deleted.
    Assertion 3 holds due to line 5 (all useful enabled accelerations have been
    fired) and line 8 (no acceleration has been added).

    Lines 21-25 correspond to the operations related to Lemma 6. Thus using
    this lemma, the induction is proved in this case. □

The completeness of MinCov is an immediate consequence of the previous 
proposition. 


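Corollary 1. When MinCov terminates, $\mathrm{Cover}(N, m_0) \subseteq \llbracket \lambda(V) \rrbracket$.


Proof. By Proposition 11 all $m \in \mathrm{Cover}(N, m_0)$ are quasi-covered. Since on
termination Front is empty, no exploring sequence exists, so for all
$m \in \mathrm{Cover}(N, m_0)$ there exists $v \in V$ such that $m \leq \lambda(v)$. □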


4 Tool and benchmarks 


In order to empirically evaluate our algorithm, we have implemented a prototype
tool which computes the clover and solves the coverability problem. This tool is
developed in the programming language Python, using the Numpy library. It can
be found on GitHub³. All benchmarks were performed on a computer equipped
with an Intel i5-8250U CPU with 4 cores, 16GB of memory and Ubuntu Linux 18.03.


Minimal coverability set. We compare MinCov with the tool MP [14], the tool
VH [16], and the tool CovProc [10]. We have also implemented the (incomplete)
minimal coverability tree algorithm denoted by AF in order to measure the
additional memory needed for the (complete) tools. Both MP and VH tools were
kindly sent to us by their authors. The tool MP has an implementation in Python
and another in C++. For comparison we selected the Python one to avoid biases
due to programming language.


³ https://github.com/IgorKhm/MinCov

We ran two kinds of benchmarks, both reported in Table 1: (1) 123 standard
benchmarks from the literature (taken from [2]), and (2) 100 randomly generated
Petri nets, since the benchmarks from the literature do not present all the
features that lead to infinite state systems. These random Petri nets have the
following properties: (1) 50 < |P|, |T| < 100, (2) the number of places connected
to each transition is bounded by 10, and (3) they are not structurally bounded.
The execution time of the tools was limited to 900 seconds.

Table 1 contains a summary of all the instances of the benchmarks. The T/O 
column shows the number of instances on which the tool timed out. The Time 
column consists of the total time on instances that did not time out plus 900 
seconds for any instance that led to a time out. The #Nodes column consists of 
the peak number of nodes in instances that did not time out on any of the tools 
(except CovProc which does not provide this number). For MinCov we take the 
peak number of nodes plus accelerations. 


Table 1. Benchmarks for clover 

           123 benchmarks from the literature    100 random benchmarks 
           T/O    Time     #Nodes                T/O    Time     #Nodes 
MinCov     16     18127    48218                 14     13989    61164 
VH         15     14873    75225                 15     13692    208134 
MP         24     23904    478681                21     21726    755129 
CovProc    49     47081    N/A                   80     74767    N/A 
AF         19     19223    45660                 16     15888    63275 


In the benchmarks from the literature we observed that the instances that 
timed out on MinCov are included in those of AF and MP. However there were 
instances that timed out on VH but did not time out on MinCov and vice versa. 
MinCov is the second fastest tool, and compared to VH it is 1.2 times slower. 
A possible explanation would be that VH is implemented in C++. As could be 
expected, w.r.t. memory requirements MinCov has the least number of nodes 
among the complete tools. In the benchmarks from the literature MinCov has 
approximately 10 times fewer nodes than MP and 1.6 times fewer than VH. In the 
random benchmarks these ratios are significantly higher. 
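
The ratios quoted in this paragraph can be rechecked directly from Table 1. The 
following Python sketch is only an illustration: the dictionary layout and the 
helper name ratio are our assumptions, and the figures are copied from the table. 

```python
# Figures copied from Table 1; the layout and helper name are illustrative
# assumptions, not part of the MinCov tool.
literature = {
    "MinCov": {"time": 18127, "nodes": 48218},
    "VH": {"time": 14873, "nodes": 75225},
    "MP": {"time": 23904, "nodes": 478681},
}
random_nets = {
    "MinCov": {"time": 13989, "nodes": 61164},
    "VH": {"time": 13692, "nodes": 208134},
    "MP": {"time": 21726, "nodes": 755129},
}


def ratio(table, tool, baseline, column):
    """Return table[tool][column] divided by table[baseline][column]."""
    return table[tool][column] / table[baseline][column]


# How much slower MinCov is than VH on the benchmarks from the literature.
slowdown_vs_vh = ratio(literature, "MinCov", "VH", "time")

# How many times more nodes MP and VH need than MinCov.
for name, table in (("literature", literature), ("random", random_nets)):
    mp = ratio(table, "MP", "MinCov", "nodes")
    vh = ratio(table, "VH", "MinCov", "nodes")
    print(f"{name}: MP/MinCov = {mp:.2f}, VH/MinCov = {vh:.2f}")

print(f"MinCov/VH time (literature) = {slowdown_vs_vh:.2f}")
```

Rounded to one decimal place, the literature ratios agree with the figures 
stated above, and both node ratios are larger on the random benchmarks. 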

Coverability. We compare MinCov to the tool qCover [2] on the set of benchmarks 
from the literature in Table 2. In [2], qCover is compared to the most 
competitive tools for coverability and achieves a score of 142 solved instances 
while the second best tool achieves a score of 122. We split the results into 
safe instances (not coverable) and unsafe ones (coverable). In both categories we 
counted the number of instances on which the tools failed (columns T/O) and 
the total time (columns Time) as in Table 1. 

We observed that the tools are complementary, i.e. qCover is faster at proving 
that an instance is safe and MinCov is faster at proving that an instance is unsafe. 


Table 2. Benchmarks for the coverability problem (60 unsafe and 115 safe) 

                    Time Unsafe   T/O Unsafe   Time safe   T/O safe   T/O   Time 
MinCov              1754          1            51323       53         54    53077 
qCover              26467         26           11865       11         37    38332 
MinCov || qCover    1841          2            13493       11         13    15334 


Therefore, by splitting the processing time between them we get better results. 
The third row of Table 2 represents a parallel execution of the tools, where the 
time for each instance is computed as follows: 


Time(MinCov || qCover) = 2 min (Time(MinCov), Time(qCover)) . 


Combining both tools is 2.5 times faster than qCover and 3.5 times faster than 
MinCov. This confirms the above statement. We could still get better results by 
dynamically deciding which ratio of CPU to share between the tools depending 
on some predicted status of the instance. 
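
As a small illustration of the formula above, the following Python sketch 
computes the parallel time of one instance and rechecks the two speed-up 
factors from the totals in the last column of Table 2. The function name, the 
sample timings and the reading of the factor 2 as "each tool receives half of 
the CPU" are our assumptions. 

```python
TIME_LIMIT = 900  # seconds, the time limit used in the benchmarks


def parallel_time(time_mincov, time_qcover):
    """Time(MinCov || qCover) = 2 * min(Time(MinCov), Time(qCover)).

    The factor 2 is read here as: both tools share the CPU equally, so the
    faster one needs twice its stand-alone time (an assumption of this sketch).
    """
    return 2 * min(time_mincov, time_qcover)


# Hypothetical per-instance timings in seconds (not taken from the paper).
print(parallel_time(3.0, 120.0))        # MinCov answers first
print(parallel_time(TIME_LIMIT, 40.0))  # MinCov timed out, qCover answers

# Totals copied from the last column of Table 2.
total = {"MinCov": 53077, "qCover": 38332, "MinCov || qCover": 15334}
speedup_vs_qcover = total["qCover"] / total["MinCov || qCover"]
speedup_vs_mincov = total["MinCov"] / total["MinCov || qCover"]
print(f"{speedup_vs_qcover:.2f} {speedup_vs_mincov:.2f}")
```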


5 Conclusion 


We have proposed a simple and efficient modification of the incomplete minimal 
coverability tree algorithm for building the clover of a net. Our algorithm 
is based on the introduction of the concepts of covering abstractions and 
accelerations. Compared to the alternative algorithms previously designed, we have 
theoretically bounded the size of the additional space. Furthermore we have 
implemented a prototype which is already very competitive. 

From a theoretical point of view, we plan to study how abstractions and 
accelerations could be defined in the more general context of well structured 
transition systems. From an experimental point of view, we will follow three 
directions in order to increase the performance of our tool. First, as in [13], we 
have to select appropriate data structures to minimize the number of 
comparisons between ω-markings. Then we want to precompute a set of accelerations 
using linear programming, as the correctness of the algorithm is preserved and 
the efficiency could be significantly improved. Last, we want to take advantage 
of parallelism in a more general way than simultaneously running several tools. 


Abstract. This paper introduces an expressive class of quotient-inductive 
types, called QW-types. We show that in dependent type theory 
with uniqueness of identity proofs, even the infinitary case of QW-types 
can be encoded using the combination of inductive-inductive definitions 
involving strictly positive occurrences of Hofmann-style quotient types, 
and Abel’s sized types. The latter, which provide a convenient constructive 
abstraction of what classically would be accomplished with transfinite 
ordinals, are used to prove termination of the recursive definitions of the 
elimination and computation properties of our encoding of QW-types. 
The development is formalized using the Agda theorem prover. 


Keywords: dependent type theory · higher inductive types · inductive-inductive 
definitions · quotient types · sized types · category theory 


1 Introduction 


One of the key features of proof assistants based on dependent type theory such 
as Agda, Coq and Lean is their support for inductive definitions of families of 
types. Homotopy Type Theory [29] introduces a potentially very useful extension 
of the notion of inductive definition, the higher inductive types (HITs). To define 
an ordinary inductive type one declares how its elements are constructed. To 
define a HIT one not only declares element constructors, but also declares 
equality constructors in identity types (possibly iterated ones), specifying how 
the constructed elements and identities are to be equated. In this paper we work 
in a dependent type theory satisfying uniqueness of identity proofs (UIP), so 
that identity types are trivial in dimensions higher than one. Nevertheless, as 
Altenkirch and Kaposi [5] point out, HITs are still useful in such a one-dimensional 
setting. They introduce the term quotient inductive type (QIT) for this truncated 
form of HIT. 

Figure 1 gives two examples of QITs, using Agda-style notation for dependent 
type theory; in particular, Set denotes a universe of types and = denotes the 
identity type. The first example specifies the element and equality constructors 
for the type Bag X of finite multisets of elements from a type X. The second 
example, adapted from [5], specifies the element and equality constructors for the 
type ωTree X of trees whose nodes are labelled with elements of X and that have 
unordered countably infinite branching. Both examples illustrate the nice feature 
of QITs that users only have to specify the particular identifications between 
data needed for their applications. Thus the standard property of equality that it 
is an equivalence relation respecting the constructors is inherited by construction 
from the usual properties of identity types, without the need to say so in the 
declaration of the QIT. 


Finite multisets: 
  data Bag (X : Set) : Set where 
    []   : Bag X 
    _::_ : X → Bag X → Bag X 
    swap : (x y : X)(ys : Bag X) → x :: y :: ys = y :: x :: ys 

Unordered countably branching trees (elements of isIso f witness that f is a bijection): 
  data ωTree (X : Set) : Set where 
    leaf : ωTree X 
    node : X → (ℕ → ωTree X) → ωTree X 
    perm : (x : X)(f : ℕ → ℕ)(_ : isIso f)(g : ℕ → ωTree X) → 
           node x g = node x (g ∘ f) 

Figure 1. Two examples of QITs 
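
To make the swap equation of the first example concrete outside type theory, 
the following Python sketch represents a finite multiset by a canonical sorted 
tuple. This is only a classical illustration: it assumes that the elements of X 
can be totally ordered, which the QIT itself does not require, and the function 
names are ours. 

```python
def empty():
    """The element constructor []."""
    return ()


def cons(x, bag):
    """The element constructor x :: bag, stored in sorted (canonical) order."""
    return tuple(sorted((x,) + bag))


def from_list(xs):
    """Quotient map sending a finite list to the multiset of its elements."""
    bag = empty()
    for x in reversed(xs):
        bag = cons(x, bag)
    return bag


# The swap equation x :: y :: ys = y :: x :: ys holds for canonical forms.
ys = from_list([3, 1, 3])
print(cons(1, cons(2, ys)) == cons(2, cons(1, ys)))

# Lists that differ only by a permutation are identified, others are not.
print(from_list([2, 1, 2]) == from_list([1, 2, 2]))
print(from_list([1, 2]) == from_list([1, 1, 2]))
```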

The second example also illustrates a more technical aspect of QITs, that they 
enable constructive versions of structures that classically use non-constructive 
choice principles. The first example in Figure 1 only involves element constructors 
of finite arity ([] is nullary and x :: _ is unary) and consequently Bag X is 
isomorphic to the type obtained from the ordinary inductive type of finite lists 
over X by quotienting by the congruence generated by swap. Of course this 
assumes, as we do in this paper, that the type theory comes with Hofmann-style 
quotient types [18, Section 3.2.6.1]. By contrast, the second example in the figure 
involves an element constructor with countably infinite arity. So if one first forms 
the ordinary inductive type of ordered countably branching trees (by dropping 
the equality constructor perm from the declaration) and then quotients by a 
suitable relation to get the equalities specified by perm, one needs the axiom of 
countable choice to be able to lift the node element constructor to the quotient; 
see [5, Section 2.2] for a detailed discussion. The construction of the Cauchy 
reals as a higher inductive-inductive type [29, Section 11.3] provides a similar, 
but more complicated example where use of countable choice is avoided. Such 
examples have led to the folklore that as far as constructive type theories go, 
infinitary QITs are more expressive than the combination of ordinary inductive (or 
inductive-recursive, or inductive-inductive) types with quotient types. In this 
paper we use Abel’s sized types [2] to show that, for a wide class of QITs, this 
view is not justified. Thus we make two main contributions: 

First we define a family of QITs called QW-types and give elimination and 
computation rules for them (Section 2). The usual W-types of Martin-Löf [22] 
are inductive types giving the algebraic terms over a possibly infinitary signature. 
One specifies a QW-type by giving a family of equations between such terms. 
So such QITs give initial algebras for possibly infinitary algebraic theories. As 
we indicate in Section 3, they can encode a very wide range of examples of 
possibly infinitary quotient-inductive types, namely those that do not involve 
constructors taking previously constructed equalities as arguments (so do not 
cover the infinitary extension of the very general scheme considered by Dybjer 
and Moeneclaey [12]). In set theory with the Axiom of Choice (AC), QW-types 
can be constructed simply as Quotients of the underlying W-type—hence the 
name. 

Secondly, we prove that contrary to expectation, without AC it is still possible 
to construct QW-types using quotients, but not simply by quotienting a W-type. 
Instead, the type to be quotiented and the relation by which to quotient are given 
simultaneously by definitions that refer to each other. Thus our construction (in 
Section 4) involves inductive-inductive definitions [15]. The elimination and 
computation functions which witness that the quotiented type correctly represents 
the required QW-type are defined recursively. In order to prove that our recursive 
definitions terminate we combine the use of inductive definitions involving strictly 
positive occurrences of quotient types with sized types (currently, we do not know 
whether it is possible to avoid sizing in favour of, say, a suitable well-founded 
termination ordering). Sized types provide a convenient constructive abstraction 
of what classically would be accomplished with sequences of transfinite ordinal 
length. 


The type theory in which we work 


To present our results we need a version of Martin-Löf Type Theory with 
(1) uniqueness of identity proofs, (2) quotient types and hence also function 
extensionality, (3) inductive-inductive datatypes (with strictly positive occurrences 
of quotient types) and (4) sized types. Lean 3 provides (1) and (2) out of the 
box, but also the Axiom of Choice, unfortunately. Neither it nor Coq provides (3) 
and (4). Agda provides (1) via unrestricted dependent pattern-matching, (2) via 
a combination of postulates and the rewriting mechanism of Cockx and Abel 
[8], (3) via its very liberal mechanism for mutual definitions and (4) thanks to 
the work of Abel [2]. Therefore we make use of the type theory implemented by 
Agda (version 2.6.0.1) to give formal proofs of our results. The Agda code can 
be found at DOI: 10.17863/CAM.48187. In this paper we describe the results 
informally, using Agda-style notation for dependent type theory. In particular 
we use Set to denote the universe at the lowest level of a countable hierarchy of 
(Russell-style) universes. We also use Agda’s convention that an implicit argument 
of an operation can be made explicit by enclosing it in {braces}. 


Acknowledgement We would like to acknowledge the contribution Ian Orton made [...] 
supervised the third author’s Master’s dissertation Quotient Inductive Types: A 
Schema, Encoding and Interpretation, in which the notion of QW-type (there [...] 


2 QW-types 


We begin by recalling some facts about types of well-founded trees, the W-types 
of Martin-Löf [22]. We take signatures to be elements of the dependent product 


Sig = ∑ A : Set, (A → Set)                                          (1) 


So a signature is given by a pair Σ = (A, B) consisting of a type A : Set and 
a family of types B : A → Set. Each such signature determines a polynomial 
endofunctor [1, 16] S{Σ} : Set → Set whose value at X : Set is the following 
dependent product 

S{Σ} X = ∑ a : A, (B a → X)                                         (2) 


An S-algebra is by definition an element of the dependent product 

Alg{Σ} = ∑ X : Set, (S X → X)                                       (3) 


S-algebra morphisms (X, s) → (X', s') are given by functions h : X → X' 
together with an element of the type 


isHom h = (a : A)(b : B a → X) → s'(a, h ∘ b) = h(s(a, b))          (4) 


Then the W-type W{Σ} determined by Σ is the underlying type of an initial 
S-algebra. More generally, Dybjer [11] shows that the initial algebra of any non- 
nested, strictly positive endofunctor on Set is given by a W-type; and Abbott, 
Altenkirch, and Ghani [1] extend this to the case with nested uses of W-types as 
part of their work on containers. (These proofs take place in extensional type 
theory [22], but work just as well in the intensional type theory with uniqueness 
of identity proofs and function extensionality that we are using here.) 

More concretely, given a signature Σ = (A, B), if one thinks of elements a : A 
as names of operation symbols whose (not necessarily finite) arity is given by 
the type B a : Set, then the elements of W{Σ} represent the closed algebraic 
terms (i.e. well-founded trees) over the signature. From this point of view it is 
natural to consider not only closed terms solely built up from operations, but 
also open terms additionally built up with variables drawn from some type X. As 
well as allowing operators of possibly infinite arity, we also allow terms involving 
possibly infinitely many variables (the second example in Figure 1 involves such 
terms). Categorically, the type T{Σ}X of such open terms is the free S-algebra 
on X and is another W-type, for the signature obtained from Σ by adding the 
elements of X as nullary operations. Nevertheless, it is convenient to give a direct 
inductive definition: 


data T{Σ : Sig}(X : Set) : Set where 
  η : X → T X                                                       (5) 
  σ : S(T X) → T X 


Given an S-algebra (Y, s) : Alg{Σ} and a function f : X → Y, the unique 
morphism of S-algebras from the free S-algebra (T X, σ) on X to (Y, s) has 
underlying function T X → Y mapping each t : T X to the element t >>= f in Y 
defined¹ by recursion on the structure of t: 


η x >>= f     = f x 
σ(a, b) >>= f = s(a, λ x → b x >>= f)                               (6) 


As the notation suggests, >>= is the Kleisli lifting operation (“bind”) for a monad 
structure on T; indeed, it is the free monad on the endofunctor S. 
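
The recursion in (6) can be sketched in Python as follows. This is an untyped 
illustration under our own assumptions: terms are encoded by two classes Eta 
and Sigma, the family of subterms b is a Python function on arity positions, 
and the algebra structure s is passed explicitly to bind (in Agda it is hidden 
by instance arguments). The example algebra s_int and its two operations are 
hypothetical. 

```python
from dataclasses import dataclass
from typing import Any, Callable


@dataclass(frozen=True)
class Eta:
    """A variable η x."""
    var: Any


@dataclass(frozen=True)
class Sigma:
    """An operation σ(a, b): symbol a applied to the family of subterms b."""
    op: Any
    args: Callable[[Any], Any]  # maps each position in the arity of op to a term


def bind(t, f, s):
    """Compute t >>= f in the algebra with structure map s, following (6)."""
    if isinstance(t, Eta):
        return f(t.var)
    return s(t.op, lambda x: bind(t.args(x), f, s))


def s_int(op, b):
    """A hypothetical algebra on the integers: succ has arity {0}, add {0, 1}."""
    if op == "succ":
        return b(0) + 1
    if op == "add":
        return b(0) + b(1)
    raise ValueError(f"unknown operation {op!r}")


# The open term add(x, succ(y)) evaluated at x = 2 and y = 3.
term = Sigma("add", lambda i: [Eta("x"), Sigma("succ", lambda _: Eta("y"))][i])
value = bind(term, {"x": 2, "y": 3}.get, s_int)
print(value)
```

Because the subterms are given by a function rather than a finite list, the 
same encoding also covers operations of infinite arity, such as node in the 
second example of Figure 1. 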

The notion of “QW-type” that we introduce in this section is obtained from 
that of W-type by considering not only the algebraic terms over a given signature, 
but also equations between terms. To code equations we use a type-theoretic 
rendering of a categorical notion of equational system introduced by Fiore and Hur, 
referred to as term equational system [14, Section 2] and as monadic equational 
system [13, Section 5], here instantiated to free monads on signatures. 


Definition 1. A system of equations over a signature Σ : Sig is specified by 


- a type E : Set (whose elements e : E name the equations) 

- a family of types V : E → Set (V e : Set contains the variables used in the 
  equation named e : E) 

- for each e : E, elements l e and r e of type T(V e), the free S-algebra on V e 
  (the terms with variables from V e that are equated by the equation named e). 


Thus a system of equations over Σ is an element of the dependent product 

Syseq{Σ} = ∑ E : Set, ∑ V : (E → Set),                              (7) 
           ((e : E) → T(V e)) × ((e : E) → T(V e)) 

An S{Σ}-algebra S X → X satisfies the system of equations ε = (E, V, l, r) : 
Syseq{Σ} if there is an element of type 

Sat{ε} X = (e : E)(p : V e → X) → ((l e) >>= p) = ((r e) >>= p)     (8) 

The category-theoretic view of QW-types is that they are simply S-algebras that 
are initial among those satisfying a given system of equations: 


Definition 2. A QW-type for a signature Σ = (A, B) : Sig and system of 
equations ε = (E, V, l, r) : Syseq{Σ} is given by a type QW{Σ}{ε} : Set equipped 
with an S-algebra structure and a proof that it satisfies the equations 


qwintro  : S(QW) → QW                                               (9) 
qwequ    : Sat{ε}(QW)                                               (10) 

together with functions that witness that it is the initial such algebra: 

qwrec    : (X : Set)(s : S X → X) → Sat X → QW → X                  (11) 
qwrechom : (X : Set)(s : S X → X)(p : Sat X) → isHom(qwrec X s p)   (12) 
qwuniq   : (X : Set)(s : S X → X)(p : Sat X)(f : QW → X) →          (13) 
           isHom f → qwrec X s p = f 


¹ Note that the definition of >>= depends on the S-algebra structure s; in Agda we use 
instance arguments to hide this dependence. 

Given the definitions of S{Σ} in (2) and Sat{ε} in (8), properties (9) and (10) 
suggest that a QW-type is an instance of the notion of quotient-inductive type [5] 
with element constructor qwintro and equality constructor qwequ. For this to 
be so, QW{Σ}{ε} needs to have the requisite dependently-typed elimination 
and computation² properties for these element and equality constructors. As 
Proposition 1 below shows, these follow from (11)-(13), because we are working 
in a type theory with function extensionality (by virtue of assuming quotient 
types). To state the proposition we need a dependent version of (6). For each 


P : QW → Set 
p : (a : A)(b : B a → QW) → ((x : B a) → P(b x)) → P(qwintro(a, b))   (14) 


type X : Set, function f : X → ∑ x : QW, P x and term t : T(X), we get an 
element lift P p f t : P(t >>= fst ∘ f) defined by recursion on the structure of t: 


lift P p f (η x)     = snd(f x) 
lift P p f (σ(a, b)) = p a (λ x → b x >>= (fst ∘ f))(lift P p f ∘ b)   (15) 


Proposition 1. For a QW-type as in the above definition, given P and p as in 
(14) and a term of type 


(e : E)(f : V e → ∑ x : QW, P x) → lift P p f (l e) == lift P p f (r e)   (16) 

there are elimination and computation terms: 


qwelim : (x : QW) → P x 
qwcomp : (a : A)(b : B a → QW) → qwelim(qwintro(a, b)) = p a b (qwelim ∘ b) 

(Note that (16) uses McBride’s heterogeneous equality type [23], which we denote 
by ==, because lift P p f (l e) and lift P p f (r e) inhabit different types, namely 
P(l e >>= fst ∘ f) and P(r e >>= fst ∘ f) respectively.) 


The proof of the proposition can be found in the accompanying Agda code 
(DOI: 10.17863/CAM.48187). 

So QW-types are in particular quotient-inductive types (QITs). Conversely, in 
the next section we show that a wide range of QITs can be encoded as QW-types. 
Then in Section 4 we prove: 


Theorem 1. In constructive dependent type theory with uniqueness of identity 
proofs (or equivalently the Axiom K of Streicher [27]) and universes with 
inductive-inductive datatypes [15] permitting strictly positive occurrences of 
quotient types [18] and sized types [2], for every signature and system of 
equations (Definition 1) there is a QW-type as in Definition 2. 


² We only establish the computation property up to propositional rather than 
definitional equality; so, using the terminology of Shulman [25], these are typal 
quotient-inductive types. 

Remark 1 (Free algebras). Definition 2 defines QW-types as initial algebras. A 
corollary of Theorem 1 is that free algebras also exist. In other words, given a 
signature Σ and a type X : Set, there is an S-algebra 


(F{Σ}{ε}X , S{Σ}(F{Σ}{ε}X) → F{Σ}{ε}X) 


satisfying a system of equations ε and equipped with a function X → F{Σ}{ε}X, 
and which is universal among such S-algebras. Thus QW{Σ}{ε} is isomorphic to 
F{Σ}{ε}Ø, where Ø is the empty datatype. 

To see that such free algebras can be constructed as QW-types, given a 
signature Σ = (A, B), let Σ_X be the signature (X ⊎ A, B'), where X ⊎ A is the 
coproduct datatype (with constructors inl : X → X ⊎ A and inr : A → X ⊎ A) 
and where B' : X ⊎ A → Set maps each inl x to Ø and each inr a to B a. Given 
a system of equations ε = (E, V, l, r), let ε_X be the system (E, V, l_X, r_X) where 
for each e : E, l_X e = l e >>= η and r_X e = r e >>= η (using η : V e → T{Σ_X}(V e) 
as in (5) and the S{Σ}-algebra structure s on T{Σ_X}(V e) given by s(a, b) = 
σ(inr a, b)). Then one can show that the QW-type QW{Σ_X}{ε_X} is the free 
algebra F{Σ}{ε}X, with the function X → F{Σ}{ε}X sending each x : X to 
qwintro(inl x, _) : QW{Σ_X}{ε_X}, and the S{Σ}-algebra structure on F{Σ}{ε}X 
being given by the function sending (a, b) : S(QW{Σ_X}{ε_X}) to qwintro(inr a, b). 


Remark 2 (Strictly positive equational systems). A very general, categorical 
notion of equational system was introduced by Fiore and Hur [14, Section 3]. 
They regard any endofunctor S : Set → Set as a functorial signature. A functorial 
term over such a signature, S ▷ G ⊢ L, is specified by another functorial signature 
G : Set → Set (the term’s context) together with a functor L from S-algebras to 
G-algebras that commutes with the forgetful functors to Set. Then an equational 
system is given by a pair of such terms in the same context, S ▷ G ⊢ L and 
S ▷ G ⊢ R say. An S-algebra s : S X → X satisfies the equational system if 
L(X, s) and R(X, s) are equal G-algebras. 

Taking the strictly positive endofunctors Set → Set to be the smallest 
collection containing the identity and constant endofunctors and closed under forming 
dependent products and dependent functions over fixed types then, as in [11] 
(and also in the type theory in which we work), up to isomorphism every such 
endofunctor is of the form S{Σ} for some signature Σ : Sig. If we restrict 
attention to equational systems S ▷ G ⊢ L, R with S and G strictly positive, then 
it turns out that such equational systems are in bijection with the systems of 
equations from Definition 1, and the two notions of satisfaction for an algebra 
coincide in that case. (See our Agda development for a proof of this.) So Dybjer’s 
characterisation of W-types as initial algebras for strictly positive endofunctors 
generalises to the fact that QW-types are initial among the algebras satisfying 
strictly positive equational systems in the sense of Fiore and Hur. 


3 Quotient-inductive types 


Higher inductive types (HITs) are originally motivated by their use in homotopy 
type theory to construct homotopical cell complexes, such as spheres, tori, and 
so on [29]. Intuitively, a higher inductive type is an inductive type with point 
constructors also allowing for path constructors, surface constructors, etc., which 
are represented as elements of (iterated) identity types. For example, the sphere 
is given by the HIT³: 


data S² : Set where 
  base : S²                                                         (17) 
  surf : refl =_{base =_{S²} base} refl 


In the presence of the UIP axiom we will refer to HITs as quotient inductive 
types (QITs) [5], since all paths beyond the first level are trivial and any HIT 
is truncated to an h-set. We use the terms element constructor and equality 
constructor to refer to the point constructors and the only non-trivial level of 
path constructors. 

We believe that QW-types can be used to encode a wide range of QITs: see 
Conjecture 1 below. As evidence, we give several examples of QITs encoded 
as QW-types, beginning with the two examples of QITs in Figure 1, giving 
the corresponding signature (A, B) and system of equations (E, V, l, r) as in 
Definition 2. 


Example 1 (Finite multisets). The element constructors for finite multisets are 
encoded exactly as with a W-type: the constructors are [] and x :: _ for each 
x : X. So we take A to be 1 ⊎ X, the coproduct of the unit type 1 (whose single 
constructor is denoted tt) with X. The arity of [] is zero, and the arity of each 
x :: _ is one, represented by the empty type Ø and unit type 1 respectively; so we 
take B : A → Set to be the function [λ _ → Ø | λ _ → 1] : 1 ⊎ X → Set mapping 
inl tt to Ø and each inr x to 1. 

The swap equality constructor is parameterised by elements of E = X × X. 
For each (x, y) : E, swap x y yields an equation involving a single free 
variable (called ys : Bag X in Figure 1); so we take V : E → Set to be λ _ → 1. 
Each side of the equation named by swap x y is coded by an element of 
T{Σ}(V(x, y)) = T{Σ}(1). Recalling the definition of T from (5), the single 
free variable corresponds to η tt : T{Σ}(1) and then the left-hand side of 
the equation is σ(inr x, (λ _ → σ(inr y, (λ _ → η tt)))) and the right-hand side is 
σ(inr y, (λ _ → σ(inr x, (λ _ → η tt)))). 

So, altogether, the signature and system of equations for the QW-type 
corresponding to the first example in Figure 1 is: 


A = 1 ⊎ X                    E = X × X 
B = [λ _ → Ø | λ _ → 1]      V = λ _ → 1 
l = λ (x, y) → σ(inr x, (λ _ → σ(inr y, (λ _ → η tt)))) 
r = λ (x, y) → σ(inr y, (λ _ → σ(inr x, (λ _ → η tt)))) 
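
The encoding of Example 1 can be tried out in Python. The sketch below is an 
untyped illustration under our own assumptions: terms and bind follow (5) and 
(6), the coproduct injections are tagged pairs, and the two candidate algebras 
s_bag (sorted tuples, which assumes an ordering on X) and s_list (plain lists) 
are hypothetical. The function satisfies only tests the swap equation on 
finitely many samples; it does not prove Sat in the sense of (8). 

```python
from dataclasses import dataclass
from typing import Any, Callable


@dataclass(frozen=True)
class Eta:
    var: Any


@dataclass(frozen=True)
class Sigma:
    op: Any
    args: Callable[[Any], Any]


def bind(t, f, s):
    """t >>= f in the algebra with structure map s, as in (6)."""
    if isinstance(t, Eta):
        return f(t.var)
    return s(t.op, lambda i: bind(t.args(i), f, s))


TT = "tt"  # the single element of the unit type 1


def inl(v):
    return ("inl", v)


def inr(v):
    return ("inr", v)


def lhs(x, y):
    """l (x, y) = σ(inr x, λ _ → σ(inr y, λ _ → η tt))."""
    return Sigma(inr(x), lambda _: Sigma(inr(y), lambda _: Eta(TT)))


def rhs(x, y):
    """r (x, y) = σ(inr y, λ _ → σ(inr x, λ _ → η tt))."""
    return Sigma(inr(y), lambda _: Sigma(inr(x), lambda _: Eta(TT)))


def s_bag(a, b):
    """Candidate algebra on sorted tuples: inl tt is [], inr x is x :: _."""
    tag, x = a
    return () if tag == "inl" else tuple(sorted((x,) + b(TT)))


def s_list(a, b):
    """Candidate algebra on plain tuples, where the order of elements matters."""
    tag, x = a
    return () if tag == "inl" else (x,) + b(TT)


def satisfies(s, xs, carriers):
    """Test the swap equation for all sampled x, y and values ys of the variable."""
    return all(
        bind(lhs(x, y), lambda _: ys, s) == bind(rhs(x, y), lambda _: ys, s)
        for x in xs for y in xs for ys in carriers
    )


samples = [(), (1,), (1, 2), (2, 2, 3)]
print(satisfies(s_bag, [1, 2, 3], samples))
print(satisfies(s_list, [1, 2, 3], samples))
```

On these samples the sorted-tuple algebra passes the test while the list 
algebra fails it, which matches the intuition that lists form an algebra for 
the signature but do not satisfy the swap equation. 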


³ The subscript on = will be treated as an implicit argument and omitted when clear. 

Example 2 (Unordered countably-branching trees). Here the element constructors 
are leaf of arity zero and, for each x : X, node x of arity ℕ. So we use the signature 
with A = 1 ⊎ X and B = [λ _ → Ø | λ _ → ℕ]. 

The perm equality constructor is parameterised by elements of 


E = X × ∑ f : (ℕ → ℕ), isIso f 


For each element (x, f, i) of that type, perm x f i yields an equation involving 
an ℕ-indexed family of variables (called g : ℕ → ωTree X in Figure 1); so we 
take V : E → Set to be λ _ → ℕ. Each side of the equation named by perm x f i 
is coded by an element of T{Σ}(V(x, f, i)) = T{Σ}(ℕ). The ℕ-indexed family 
of variables is represented by the function η : ℕ → T{Σ}(ℕ) and its permuted 
version by η ∘ f. Thus the left- and right-hand sides of the equation named by 
perm x f i are coded respectively by the elements σ(inr x, η) and σ(inr x, η ∘ f) of 
T{Σ}(ℕ). 

So, altogether, the signature and system of equations for the QW-type 
corresponding to the second example in Figure 1 is: 


A = 1 ⊎ X                    E = X × ∑ f : (ℕ → ℕ), isIso f 
B = [λ _ → Ø | λ _ → ℕ]      V = λ _ → ℕ 
l = λ (x, _, _) → σ(inr x, η) 
r = λ (x, f, _) → σ(inr x, η ∘ f) 


That unordered countably-branching trees are a QW-type is significant since no 
previous work on various subclasses of QITs (or indeed QIITs [19, 10]) supports 
infinitary QITs [6, 26, 28, 12, 19, 10]. See Example 5 for another, more substantial 
infinitary QW-type. So this extension represents one of our main contributions. 
QW-types generalise prior developments; the internal encodings for particular 
subclasses of 1-HITs given by Sojakova [26] and Swan [28] are direct instances of 
QW-types, as the next two examples show. 


Example 3. W-suspensions [26] are an instance of QW-types. The data for 
a W-suspension is: A', C' : Set, a type family B' : A' → Set and functions 
l', r' : C' → A'. The equivalent QW-type is: 


A = A'     E = C'                                  l = λ c → σ((l' c), η) 
B = B'     V = λ c → (B' (l' c)) × (B' (r' c))     r = λ c → σ((r' c), η) 


Example 4. The non-indexed case of W-types with reductions [28] are QW-types. 
The data of such a type is: Y : Set, X : Y → Set and a reindexing map 
R : (y : Y) → X y. The reindexing map identifies a term σ(y, α) with some 
α (R y) used to construct it. The equivalent QW-type is given by: 


A = Y     E = Y     l = λ y → σ(y, η) 
B = X     V = X     r = λ y → η (R y) 

Example 5. Lumsdaine and Shulman [21, Section 9] give an example of a HIT 
not constructible in type theory from only pushouts and ℕ. Their HIT F can 
be thought of as a set of notations for countable ordinals. It consists of three 
point constructors: 0 : F, S : F → F, and sup : (ℕ → F) → F, and five path 
constructors which are omitted here for brevity. It is inspired by the infinitary 
algebraic theory of Blass [7, Section 9] and hence it is not surprising that it can 
be encoded by a QW-type; the details can be found in our Agda code. 


3.1 General QIT schemas 


Basold, Geuvers, and van der Weide [6] present a schema (though not a model) 
for infinitary QITs that do not support conditional path equations. Constructors 
are defined by arbitrary polynomial endofunctors built up using (non-dependent) 
products and sums, which means in particular that parameters and arguments 
can occur in any order. They require constructors to be in uncurried form. 

Dybjer and Moeneclaey [12, Sections 3.1 and 3.2] present a schema for finitary 
QITs that supports conditional path equations, where constructors are allowed 
to take inductive arguments not just of the datatype being declared, but also 
of its identity type. This schema can be generalised to infinitary QITs with 
conditional path equations. We believe this extension of their schema to be the 
most general schema for QITs. The schema requires all parameters to appear 
before all arguments, whereas the schema for regular inductive types in Agda is 
more flexible, allowing parameters and arguments in any order. 

We wish to combine the schema for infinitary QITs of Basold, Geuvers, and 
van der Weide [6] with the schema for QITs with conditional path equations of 
Dybjer and Moeneclaey [12] to provide a general schema. Moreover, we would 
like to combine the arbitrarily ordered parameters and arguments of the former 
with the curried constructors of the latter in order to support flexible pattern 
matching. 

For consistency with the definition of inductive types in Agda [9, equation (25) 
and figure 1] we will define strictly positive (i.e. polynomial) endofunctors in 
terms of strictly positive telescopes. 

A telescope is given by the grammar: 


A n= é empty telescope (18) 
| (c: AJA (x ¢dom(A)) non-empty telescope 

A telescope extension (x : A)A binds (free) occurrences of x inside the tail A. 

The type A may contain free variables that are later bound by further telescope 

extensions on the left. A telescope can also exist in a context which binds any 

free variables not already bound in the telescope. Such a context is implicit in 

the following definitions. A function type A > C from a telescope A to a type C 


is defined as an iterated dependent function type by: 


def 


e> C=C 


(19) 
(x: AJA > C= (x: A) > (AC) 

A strictly positive endofunctor on a variable $Y$ is presented by a strictly positive
telescope

$$\Delta = (x_1 : \Phi_1(Y))(x_2 : \Phi_2(Y)) \cdots (x_n : \Phi_n(Y))\epsilon \tag{20}$$

where each type scheme $\Phi_i$ is described by an expression on $Y$ made up of $\Pi$-types,
$\Sigma$-types, and any (previously defined “constant”) types $A$ not containing $Y$,
according to the grammar:

$$\Phi(Y), \Psi(Y) ::= (y : A) \to \Phi(Y) \mid \Sigma p : \Phi(Y), \Psi(Y) \mid A \mid Y \tag{21}$$

For example, $\Delta = (x : X)(f : \mathbb{N} \to Y)\epsilon$ is the strictly positive telescope for the
node constructor in Figure 1. In this instance, reordering $x$ and $f$ is permitted by
exchange. Note that the variable $Y$ can never appear in the argument position of
a $\Pi$-type.
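
As a small worked instance, the function type out of the example telescope can be unfolded step by step using the two clauses of (19). This is only an illustrative calculation; it writes $\Delta$ for the example telescope and $\epsilon$ for the empty telescope, and takes the target type $C$ to be $Y$.

```latex
\begin{aligned}
\Delta \to Y
  &= (x : X)\big((f : \mathbb{N} \to Y)\epsilon\big) \to Y \\
  &= (x : X) \to \big((f : \mathbb{N} \to Y)\epsilon \to Y\big) \\
  &= (x : X) \to (f : \mathbb{N} \to Y) \to (\epsilon \to Y) \\
  &= (x : X) \to (f : \mathbb{N} \to Y) \to Y
\end{aligned}
```

The result is the curried dependent function type one would expect for a constructor taking a label and an $\mathbb{N}$-indexed family of subtrees.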

Now it is possible to define the form of the endpoints of an equality (within 
the context of a strictly positive telescope), corresponding to the notion of an 
abstract syntax tree with free variables. With this intuition in mind, we can take 
the definition in Dybjer and Moeneclaey’s presentation [12] of endpoints given 
by point constructor patterns: 


$$l, r, p ::= c_i\, k \mid y \tag{22}$$


where $y : Y$ is in the context of the telescope for the equality constructor, and $k$
is a term built without any rule for $Y$, but which may use other point constructor
patterns p: Y. (That is, any sub-term of type Y must either be a variable y : Y 
found in the telescope, or a constructor for Y applied to further point constructor 
patterns and earlier defined constants. It could not, for instance, use the function 
application rule for Y with some function g : M → Y, not least since such
functions cannot be defined before defining Y.) Note that this exactly matches 
the type T in (5). 

Basold, Geuvers, and van der Weide’s presentation has a slightly more general
notion of constructor term [6, Definition 6] (Dybjer and Moeneclaey’s presentation
[12] has more restricted telescopes). It is defined by rules which operate in the
context of a strictly positive (polynomial) telescope and permit use of its bound
variables, and the use of constructors $c_i$, but not any other rules for $Y$. We take
the dependent form of their rules for products and functions. Note that these
rules do not allow the use of terms of type $=_Y$ in the endpoints.

As with inductive types, the element constructors of QITs are specified by 
strictly positive telescopes. The equality constructors also permit conditions 
to appear in strictly positive positions, where l and r are constructor terms 
according to grammar (22): 


$$\Phi(Y), \Psi(Y) ::= \text{(same grammar as in (21))} \mid l =_Y r \tag{23}$$

Definition 3. A QIT is defined by a list of named element constructors and
equality constructors:


data Y : Set where
  c₁ : Δ₁ → Y
    ⋮
  cₙ : Δₙ → Y
  p₁ : Θ₁ → l₁ =_Y r₁
    ⋮
  pₘ : Θₘ → lₘ =_Y rₘ


where $\Delta_i$ are strictly positive telescopes on $Y$ according to (21), and $\Theta_j$ are
strictly positive telescopes on $Y$ and $=_Y$ in which conditions may also occur in
strictly positive positions according to (23).


QITs without equality constructors are inductive types. If none of the equality 
constructors contain Y in an argument position then it is called non-recursive, 
otherwise it is called recursive [6]. If none of the equality constructors contain an 
equality in Y then we call it a non-conditional, or equational, QIT, otherwise it is 
called a conditional [12], or quasi-equational, QIT. If all of the constant types A in 
any of the constructors are finite (isomorphic to Fin n for n : ℕ) then it is called
a finitary QIT [12]. Otherwise, it is called a generalised [12], or infinitary, QIT. 
We are not aware of any existing examples in the literature of HITs which allow 
the point constructors to be conditional (though it is not difficult to imagine), 
nor any schemes for HITs that allow such definitions. However, we do believe 
this is worth investigating further. 


Conjecture 1. Any equational QIT can be encoded as a QW-type. 


We believe this can be proved analogously to the approach of Dybjer [11] for 
inductive types, though the endpoints still need to be considered and we have 
not yet translated the schema in Definition 3 into Agda.


Remark 3. Assuming Conjecture 1, Basold, Geuvers, and van der Weide’s schema 
[6], being an equational (non-conditional) instance of Definition 3, can be encoded 
as a QW-type. 


4 Construction of QW-types 


In Section 2 we defined a QW-type to be initial among algebras over a given
(possibly infinitary) signature satisfying a given system of equations (Definition 2).
If one interprets these notions in classical Zermelo-Fraenkel set theory with the
Axiom of Choice (ZFC), one regains the usual notion from universal algebra
of initial algebras for infinitary equational theories. Since in the set-theoretic
interpretation there is an upper bound on the cardinality of arities of operators
in a given signature $\Sigma$, the ordinal-indexed sequence $S^{\alpha}(\emptyset)$ of iterations of the
functor in (2) starting from the empty set eventually becomes stationary; and
so the sequence has a small colimit, namely the set $W\{\Sigma\}$ of well-founded trees
over $\Sigma$. A system of equations $\varepsilon$ (Definition 1) over $\Sigma$ generates a $\Sigma$-congruence
relation $\sim$ on $W\{\Sigma\}$. The quotient set $W\{\Sigma\}/{\sim}$ yields the desired initial algebra
for $(\Sigma, \varepsilon)$ provided the $S$-algebra structure on $W\{\Sigma\}$ induces one on the quotient
set. It does so, because for each operator, using AC one can pick representatives
of the (possibly infinitely many) equivalence classes that are the arguments of
the operator, apply the interpretation of the operator in $W\{\Sigma\}$ and then take
the equivalence class of that. So the set-theoretic model of type theory in ZFC
models QW-types.
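
The induced operation on the quotient described above can be sketched in symbols. The notation here is an assumption made for illustration and is not taken from the paper: $\mathrm{Ar}(a)$ stands for the arity of an operator $a$, $\mathrm{sup}(a, (t_i)_i)$ for the tree in $W\{\Sigma\}$ with root $a$ and subtrees $t_i$, and $[t]_{\sim}$ for the equivalence class of $t$. Each $t_i$ is a representative picked, using AC, from the $i$-th argument class.

```latex
a^{W\{\Sigma\}/\sim}\big( ([t_i]_{\sim})_{i \in \mathrm{Ar}(a)} \big)
  \;=\; \big[\, \mathrm{sup}\big(a, (t_i)_{i \in \mathrm{Ar}(a)}\big) \,\big]_{\sim}
\qquad\text{where}\qquad
\big(\forall i.\; t_i \sim t'_i\big) \;\Longrightarrow\;
\mathrm{sup}(a, (t_i)_i) \sim \mathrm{sup}(a, (t'_i)_i)
```

The implication on the right is the congruence property of $\sim$; it is what makes the result independent of the chosen representatives. Choice is needed only to select the whole family $(t_i)_i$ at once when $\mathrm{Ar}(a)$ is infinite.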

Is this use of choice really necessary? Blass [7, Section 9] shows that if one 
drops AC and just works in ZF, then provided a certain large cardinal axiom is 
consistent with ZFC, it is consistent with ZF that there is an infinitary equational 
theory with no initial algebra. He shows this by first exhibiting a countably 
presented equational theory whose initial algebra has to be an uncountable 
regular cardinal; and secondly appealing to the construction of Gitik [17] of a 
model of ZF with no uncountable regular cardinals (assuming a certain large 
cardinal axiom). Lumsdaine and Shulman [21] turn the infinitary equational 
theory of Blass into a higher-inductive type that cannot be proved to exist in 
ZF (and hence cannot be constructed in type theory just using pushouts and the 
natural numbers). We noted in Example 5 that this higher inductive type can be 
presented as a QW-type. 

So one cannot hope to construct QW-types using a type theory which is 
interpretable in just ZF. However, the type theory in which we work, with its 
universes closed under inductive-inductive definitions, already requires going 
beyond ZF to be able to give it a naive, classical set-theoretic interpretation (by 
assuming the existence of enough strongly inaccessible cardinals, for example). So 
the above considerations about initial algebras for infinitary equational theories 
in classical set theory do not rule out the construction of QW-types in the type 
theory in which we work. However, something more than just quotienting a 
W-type is needed in order to prove Theorem 1. 

Figure 2 gives a first attempt to do this (which later we will modify using sized
types to get around a termination problem). The definition is relative to a given
signature Σ : Sig and system of equations ε = (E, V, l, r) : Syseq Σ. It makes use
of quotient types, which we add to Agda via postulates, as shown in Figure 3.⁴
The REWRITE pragma makes elim R B f e (mk R x) definitionally equal to f x
and is not merely a computational convenience: this is what allows function
extensionality to be proved from these postulated quotient types. The POLARITY
pragmas enable the postulated quotients to be used in datatype declarations
at positions that Agda deems to be strictly positive; a case in point being the
definitions of Q₀ and Q₁ in Figure 2. Agda’s test for strict positivity is sound
with respect to a set-theoretic semantics of inductively defined datatypes that
are built up using strictly positive uses of dependent functions; the semantics of
such datatypes uses initial algebras for endofunctors possessing a rank. Here we


4 The actual implementation is polymorphic in universe levels, but for simplicity here 
we just give the level-zero version. 

mutual
  data Q₀ : Set where
    sq : T Q → Q₀

  data Q₁ : Q₀ → Q₀ → Set where
    sqeq : (e : E)(p : V e → Q) → Q₁ (sq (T' p (l e))) (sq (T' p (r e)))
    sqη : (x : Q₀) → Q₁ (sq (η (qu x))) x
    sqσ : (s : S (T Q)) → Q₁ (sq (σ s)) (sq (ι (S' (qu ∘ sq) s)))

  Q : Set
  Q = Q₀ / Q₁

  qu : Q₀ → Q
  qu = quot.mk Q₁

QW{Σ}{ε} = Q


Figure 2. First attempt at constructing QW-types 


are allowing the inductively defined datatypes to be built up using quotients as
well, but this is semantically unproblematic, since quotienting does not increase
rank. (Later we need to combine the use of POLARITY with sized types; the
semantics of this has been studied for System Fω [3], but needs to be explored
further for Agda.)

We build up the underlying inductive type Q₀ to be quotiented using a
constructor sq that takes well-founded trees T(Q₀/Q₁) of whole equivalence
classes with respect to a relation Q₁ that is mutually inductively defined with
Q₀, an instance of an inductive-inductive definition [15]. The definition of Q₁
makes use of the actions on functions of the signature endofunctor S and its
associated free monad T (Section 2); those actions are defined as follows:


S' : {X Y : Set} → (X → Y) → S X → S Y
S' f (a , b) = (a , f ∘ b)                              (24)

T' : {X Y : Set} → (X → Y) → T X → T Y                  (25)
T' f t = t >>= (η ∘ f)


The definition of Q₁ also uses the natural transformation ι : {X : Set} → S X →
T X defined by ι = σ ∘ S' η.
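
To make the actions (24) and (25) concrete, here is a minimal self-contained Agda sketch. It rests on assumptions: the record `Sig` with fields `Op` and `Ar`, the datatypes `S` and `T`, the local composition operator and the bind `_>>=_` are illustrative stand-ins for the definitions of Section 2, which are not reproduced here, and the signature is passed as an explicit argument rather than fixed once and for all.

```agda
module SignatureSketch where

-- Assumed stand-in for Sig: operations together with their arities.
record Sig : Set₁ where
  field
    Op : Set
    Ar : Op → Set
open Sig

-- Function composition, defined locally to avoid library imports.
_∘_ : {A B C : Set} → (B → C) → (A → B) → A → C
(g ∘ f) x = g (f x)

-- Signature endofunctor: an operation paired with its arguments.
data S (sig : Sig) (X : Set) : Set where
  _,_ : (a : Op sig) → (Ar sig a → X) → S sig X

-- Free monad on S: variables (η) or an operation over subtrees (σ).
data T (sig : Sig) (X : Set) : Set where
  η : X → T sig X
  σ : S sig (T sig X) → T sig X

-- Monadic bind: substitute a tree for each variable.
_>>=_ : {sig : Sig} {X Y : Set} → T sig X → (X → T sig Y) → T sig Y
η x       >>= f = f x
σ (a , b) >>= f = σ (a , (λ i → (b i >>= f)))

-- (24): action of S on functions.
S' : {sig : Sig} {X Y : Set} → (X → Y) → S sig X → S sig Y
S' f (a , b) = (a , (f ∘ b))

-- (25): action of T on functions, defined via bind.
T' : {sig : Sig} {X Y : Set} → (X → Y) → T sig X → T sig Y
T' f t = t >>= (η ∘ f)

-- ι = σ ∘ S' η embeds one layer of operations into the free monad.
ι : {sig : Sig} {X : Set} → S sig X → T sig X
ι = σ ∘ S' η
```

In this sketch the bind is structurally recursive on the tree, which is why Agda's termination checker accepts it directly; the difficulty discussed later arises only once the trees contain quotients.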

Turning to the proof of Theorem 1 using the definitions in Figure 2, the
S-algebra structure (9) is easy to define without using any form of choice, because
of the type of Q₀’s constructor sq. Indeed, we can just take qwintro to be
qu ∘ sq ∘ ι : S(QW) → QW.⁵ The first constructor sqeq of the data type Q₁ ensures
that the quotient Q₀/Q₁ satisfies the equations in ε, so that we get qwequ as
in (10); and the other two constructors, sqη and sqσ, make identifications that


⁵ The use of the free monad T{Σ} in the domain of sq, rather than just S{Σ}, seems
necessary in order to define Q₁ with the properties needed for (10)–(13).

module quot where
  postulate
    ty : {A : Set}(R : A → A → Set) → Set
    mk : {A : Set}(R : A → A → Set) → A → ty R
    eq : {A : Set}(R : A → A → Set){x y : A} → R x y → mk R x ≡ mk R y
    elim : {A : Set}(R : A → A → Set)(B : ty R → Set)(f : (x : A) → B (mk R x))
      (e : {x y : A} → R x y → f x == f y)(z : ty R) → B z
    comp : {A : Set}(R : A → A → Set)(B : ty R → Set)(f : (x : A) → B (mk R x))
      (e : {x y : A} → R x y → f x == f y)(x : A) → elim R B f e (mk R x) ≡ f x

  {-# REWRITE comp #-}
  {-# POLARITY ty ++ ++ #-}
  {-# POLARITY mk _ _ * #-}


_/_ : (A : Set)(R : A → A → Set) → Set
A / R = quot.ty R


Figure 3. Quotient types 


enable the construction of functions qwrec, qwrechom and qwuniq as in (11)–(13).
However, there is a problem. Given X : Set, s : S X → X and e : Sat X, for
qwrec X s e we have to construct a function r : Q → X. Since Q = Q₀/Q₁ is a
quotient, we will have to use the eliminator quot.elim from Figure 3 to define r.
The following is an obvious candidate definition


mutual                                                  (26)
  r : Q → X
  r = quot.elim Q₁ (λ _ → X) r₀ r₁

  r₀ : Q₀ → X
  r₀ (sq t) = t >>= r

  r₁ : {x y : Q₀} → Q₁ x y → r₀ x ≡ r₀ y
  r₁ = ⋯


(where we have elided the details of the invariance proof r₁). The problem with
this mutually recursive definition is that it is not clear to us (and certainly not
to Agda) whether it gives totally defined functions: although the value of r₀ at a
typical element sq t is explained in terms of the structurally smaller element t, the
explanation involves r, whose definition uses the whole function r₀ rather than
some application of it at a structurally smaller argument. Agda’s termination
checker rejects the definition.

We get around this problem by using a type-based termination method, 
namely Agda’s implementation of sized types [2]. Intuitively, this provides a type 
Size of “sizes” which give a constructive abstraction of features of ordinals in ZF 
when they are used to index sequences of sets that eventually become stationary, 
such as in various transfinite constructions of free algebras [20, 14]. In Agda,
the type Size comes equipped with various relations and functions: given sizes 

mutual
  data Q₀ (i : Size) : Set where
    sq : {j : Size< i} → T (Q j) → Q₀ i

  data Q₁ (i : Size) : Q₀ i → Q₀ i → Set where
    sqeq : {j : Size< i}(e : E)(p : V e → Q j) → Q₁ i (sq (T' p (l e))) (sq (T' p (r e)))
    sqη : {j : Size< i}(x : Q₀ j) → Q₁ i (sq (η (qu j x))) (φ₀ i x)
    sqσ : {j : Size< i}{k : Size< j}(s : S (T (Q k))) →
      Q₁ i (sq (σ s)) (sq (ι (S' (qu j ∘ sq) s)))

  Q : Size → Set
  Q i = (Q₀ i) / Q₁ i

  qu : (i : Size) → Q₀ i → Q i
  qu i = quot.mk (Q₁ i)

  φ₀ : (i : Size){j : Size< i} → Q₀ j → Q₀ i
  φ₀ i (sq z) = sq z

QW{Σ}{ε} = Q ∞


Figure 4. Construction of QW-types using sized types 


i, j : Size, there is a relation i : Size< j to indicate strictly increasing size (so
the type Size< j is treated as a subtype of Size); there is a successor operation
↑_ : Size → Size (and also a join operation _⊔ˢ_ : Size → Size → Size, but we
do not need it here); and a size ∞ : Size to indicate where a sequence becomes
stationary. Thus we construct the QW-type QW{Σ}{ε} as Q ∞ for a suitable
size-indexed sequence of types Q : Size → Set, shown in Figure 4.

For each size i : Size, the type Q i is a quotient Q₀ i / Q₁ i, where the constructors
of the data types Q₀ i and Q₁ i take arguments of smaller sizes j : Size< i.
Consequently in the following sized version of (26)


mutual                                                  (27)
  r : {i : Size} → Q i → X
  r {i} = quot.elim (Q₁ i) (λ _ → X) (r₀ {i}) (r₁ {i})

  r₀ : {i : Size} → Q₀ i → X
  r₀ {i} (sq {j} t) = t >>= r {j}

  r₁ : {i : Size}{x y : Q₀ i} → Q₁ i x y → r₀ x ≡ r₀ y
  r₁ = ⋯


the definition of r₀ {i} involves a recursive call via r to the whole function r₀, but
at a size j which is smaller than i. So now Agda accepts that the definition of
qwrec X s e as r ∞, with r as in (27), is terminating.
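
The same termination pattern can be seen in a much smaller setting. The sketch below is not the construction of Figure 4; it is an analogous, hypothetical example in which a recursive function is handed as a whole to a higher-order function. The names `Rose`, `count`, `map` and `sum` are introduced only for illustration, and the module assumes a version of Agda that provides the `--sized-types` option and the builtin modules imported here.

```agda
{-# OPTIONS --sized-types #-}
module SizedSketch where

open import Agda.Builtin.Size
open import Agda.Builtin.Nat
open import Agda.Builtin.List

map : {A B : Set} → (A → B) → List A → List B
map f []       = []
map f (x ∷ xs) = f x ∷ map f xs

sum : List Nat → Nat
sum []       = 0
sum (n ∷ ns) = n + sum ns

-- Finitely branching trees indexed by a size bound:
-- the subtrees of a tree of size i live at some smaller size j.
data Rose (i : Size) : Set where
  node : {j : Size< i} → List (Rose j) → Rose i

-- Counts the nodes of a tree. The recursive occurrence of count is
-- passed unapplied to map, but at the strictly smaller size j,
-- which is the information the termination checker uses.
count : {i : Size} → Rose i → Nat
count {i} (node {j} ts) = suc (sum (map (count {j}) ts))
```

Without the size index, the call `map count ts` mentions the whole function `count` rather than an application of it to a structurally smaller argument, which mirrors the situation of r₀ in (26).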

Thus we get a function qwrec for (11). We still have (9), but now with
qwintro = qu ∞ ∘ sq {∞} ∘ ι; and as before, the constructor sqeq of Q₁ in Figure 4
ensures that QW = (Q₀ ∞)/Q₁ ∞ satisfies the equations ε. With these definitions
it turns out that each qwrec X s e is an S-algebra morphism up to definitional
equality, so that the function qwrechom needed for (12) is straightforward to
define. Finally, the function qwuniq needed for (13) can be constructed via a
sequence of lemmas making use of the other two constructors of the data type
Q₁, namely sqη, which makes use of an auxiliary function for coercing between
different size instances of Q₀, and sqσ. We refer the reader to the accompanying
Agda code (DOI: 10.17863/CAM.48187) for the details of the construction of 
qwuniq. Altogether, the sized definitions in Figure 4 allow us to complete a proof 
of Theorem 1. 


5 Conclusion 


QW-types are a general form of QIT that capture many examples, including simple 
1-cell complexes and non-recursive QITs [6], non-structural QITs [26], W-types 
with reductions [28], and also infinitary QITs (e.g. unordered infinitely branching 
trees [5], and ordinals [21]). They also capture the notion of initial (and free) 
algebras for strictly positive equational systems [14], analogously to how W-types 
capture the notion of initial (and free) algebras for strictly positive endofunctors 
(see Remark 2). Using Agda to formalise our results, we have shown that it 
is possible to construct any QW-type, even infinitary ones, in intensional type 
theory satisfying UIP, using inductive-inductive definitions permitting strictly 
positive occurrences of quotients and sized types (see Theorem 1 and Section 4). 
We conclude by mentioning related work and some possible directions for future 
work. 


Quotients of monads. In view of Remark 2, Section 4 gives a construction of 
initial algebras for equational systems [14] on the free monad T{Σ} generated by
a signature Σ. By a suitable change of signature (see Remark 1) this extends to
a construction of free algebras, rather than just initial ones. We can show that 
the construction works for an arbitrary strictly positive monad and not just for 
free ones. Given such a construction one gets a quotient monad morphism from 
the base monad to the quotient monad. This contravariantly induces a forgetful 
functor from the algebras of the latter to that of the former. Using the adjoint 
triangle theorem, one should be able to construct a left adjoint. This would then 
cover examples such as the free group over a monoid, free ring over a group, etc. 


Quotient inductive-inductive types. The notion of QW-type generalises to indexed 
QW-types, analogously to the generalisation of W-types to Petersson-Synek trees 
for inductively defined indexed families of types [24, Chapter 16], and we will 
consider it in subsequent work. More generally, we wonder whether our analysis 
of QITs using quotients, inductive-inductive and sized types can be extended to 
cover the notion of quotient inductive-inductive type (QIIT) [4, 19]. Dijkstra [10]
studies such types in depth and in Chapter 6 of his thesis gives a construction 
for finitary ones in terms of countable colimits, and hence in terms of countable 
coproducts and quotients. One could hope to pass to the infinitary case by using 
sized types as we have done, provided an analogue for QIITs can be found of 
the monadic construction in Section 4 for our class of QITs, the QW-types.
Kaposi, Kovács, and Altenkirch [19] give a specification of finitary QIITs using a
domain-specific type theory called the theory of signatures and prove existence of 
QIITs matching this specification. It might be possible to encode their theory of 
signatures using QW-types (it can already be encoded as a QIIT), or to extend 
QW-types making this possible. This would allow infinitary QIITs. 


Schemas for QITs. We have shown by example that QW-types can encode a wide 
range of QITs. However, we have yet to extend this to a proof of Conjecture 1 
that every instance of the schema for QITs considered in Section 3 can be so 
encoded. 


Conditional path equations. In Section 3 we mentioned the fact that Dybjer and 
Moeneclaey [12] give a model for finitary 1-HITs and 2-HITs in which constructors
are allowed to take arguments involving the identity type of the datatype being 
declared. On the face of it, QW-types are not able to encode such conditional 
QITs. We plan to consider whether it is possible to extend the notion of QW-type 
to allow encoding of infinitary QITs with such conditional equations. 


Homotopy Type Theory (HoTT). Our development makes use of UIP (and het- 
erogeneous equality), which is well-known to be incompatible with the Univalence 
Axiom [29, Example 3.1.9]. Given the interest in HoTT, it is certainly worth 
investigating whether a result like Theorem 1 holds in univalent foundations for a 
suitably coherent version of QW-types. We are currently investigating this using 
set-truncation. 


Pattern matching for QITs and HITs. Our reduction of QITs to induction- 
induction, strictly positive quotients and sized types is of theoretical interest, but 
in practice one could wish for more direct support in systems like Agda, Lean and 
Coq for the very useful notion of quotient inductive types (or more generally, for 
higher inductive types). Even having better support for the special case of quotient 
types would be welcome. It is not hard to envisage the addition of a general schema 
for declaring QITs; but when it comes to defining functions on them, having 
to do that with eliminator forms rapidly becomes cumbersome (for example, 
for functions of several QIT arguments). Some extension of dependently typed 
pattern matching to cover equality constructors as well as element constructors 
is needed and the third author has begun work on that based on the approach of 
Cockx and Abel [9].⁶


⁶ In this context it is worth mentioning that the cubical features of recent versions
of Agda give access to cubical type theory [30]. This allows for easy declaration of 
HITs and hence in particular QITs (and quotients avoiding the need for POLARITY 
pragmas) and a certain amount of pattern matching when it comes to defining 
functions on them: the value of a function on a path constructor can be specified by 
using generic elements of the interval type in point-level patterns; but currently the 
user is given little mechanised assistance to solve the definitional equality constraints 
on end-points of paths that are generated by this method. 
Abstract. The glueing construction, defined as a certain comma category,
is an important tool for reasoning about type theories, logics, and
programming languages. Here we extend the construction to accommo- 
date ‘2-dimensional theories’ of types, terms between types, and rewrites 
between terms. Taking bicategories as the semantic framework for such 
systems, we define the glueing bicategory and establish a bicategorical 
version of the well-known construction of cartesian closed structure on 
a glueing category. As an application, we show that free finite-product 
bicategories are fully complete relative to free cartesian closed bicate- 
gories, thereby establishing that the higher-order equational theory of 
rewriting in the simply-typed lambda calculus is a conservative extension 
of the algebraic equational theory of rewriting in the fragment with finite 
products only. 


Keywords: glueing, bicategories, cartesian closure, relative full com- 
pleteness, rewriting, type theory, conservative extension 


1 Introduction 


Relative full completeness for cartesian closed structure. Every small 
category C can be viewed as an algebraic theory. This has sorts the objects of 
C with unary operators for each morphism of C and equations determined by 
the equalities in C. Suppose one freely extends C with finite products. Categorically,
one obtains the free cartesian category $F^{\times}[C]$ on C. From the well-known
construction of $F^{\times}[C]$ (see e.g. [12] and [46, §8]) it is direct that the universal
functor $C \to F^{\times}[C]$ is fully-faithful, a property we will refer to as the relative full
completeness (c.f. [2,16]) of C in $F^{\times}[C]$. Type theoretically, $F^{\times}[C]$ corresponds
to the Simply-Typed Product Calculus (STPC) over the algebraic theory of C,
given by taking the fragment of the Simply-Typed Lambda Calculus (STLC)
consisting of just the types, rules, and equational theory for products. Relative
full completeness corresponds to the STPC being a conservative extension.
Consider now the free cartesian closed category $F^{\times,\to}[C]$ on C, type-theoretically
corresponding to the STLC over the algebraic theory of C. Does the relative full
completeness property, and hence conservativity, still hold for either C in $F^{\times,\to}[C]$
or for $F^{\times}[C]$ in $F^{\times,\to}[C]$? Precisely, is either the universal functor $C \to F^{\times,\to}[C]$
or its universal cartesian extension $F^{\times}[C] \to F^{\times,\to}[C]$ full and faithful? The
answer is affirmative, but the proof is non-trivial. One must either reason proof- 
theoretically (e.g. in the style of [63, Chapter 8]) or employ semantic techniques 
such as glueing [39, Annexe C]. 

In this paper we consider the question of relative full completeness in the 
bicategorical setting. This corresponds to the question of conservativity for 
2-dimensional theories of types, terms between types, and rewrites between 
terms (see [32,20]). We focus on the particular case of the STLC with invertible 
rewrites given by β-reductions and η-expansions, and its STPC fragment. By
identifying these two systems with cartesian closed, resp. finite product, structure 
‘up to isomorphism’ one recovers a conservative extension result for rewrites akin 
to that for terms. 


2-dimensional categories and rewriting. It has been known since the 
1980s that one may consider 2-dimensional categories as abstract reduction sys- 
tems (e.g. [54,51]): if sorts are 0-cells (objects) and terms are 1-cells (morphisms), 
then rewrites between terms ought to be 2-cells. Indeed, every sesquicategory 
(of which 2-categories are a special class) generates a rewriting relation $\rightsquigarrow$ on its
1-cells defined by $f \rightsquigarrow g$ if and only if there exists a 2-cell $f \Rightarrow g$ (e.g. [60,58]).
Invertible 2-cells may be then thought of as equality witnesses. 

The rewriting rules of the STLC arise naturally in this framework: Seely [56]
observed that β-reduction and η-expansion may be respectively interpreted as
the counit and unit of the adjunctions corresponding to lax (directed) products
and exponentials in a 2-category (c.f. also [34,27]). This approach was taken up by
Hilken [32], who developed a ‘2-dimensional λ-calculus’ with strict products and
lax exponentials to study the proof theory of rewriting in the STLC (c.f. also [33]).

Our concern here is with equational theories of rewriting, and we follow Seely 
in viewing weak categorical structure as a semantic model of rewriting modulo an 
equational theory. We are not aware of non-syntactic examples of 2-dimensional 
cartesian closed structure that are lax but not pseudo (i.e. up to isomorphism) 
and so adopt cartesian closed bicategories as our semantic framework. 

From the perspective of rewriting, a sesquicategory embodies the rewriting of 
terms modulo the monoid laws for identities and composition, while a bicategory 
embodies the rewriting of terms modulo the equational theory on rewrites given 
by the triangle and pentagon laws of a monoidal category. Cartesian closed 
bicategories further embody the usual β-reductions and η-expansions of STLC
modulo an equational theory on rewrites; for instance, this identifies the composite
rewrite $\langle t_1, t_2 \rangle \Rightarrow \langle \pi_1(\langle t_1, t_2 \rangle), \pi_2(\langle t_1, t_2 \rangle) \rangle \Rightarrow \langle t_1, t_2 \rangle$ with the identity rewrite.
Indeed, in the free cartesian closed bicategory over a signature of base types
and constant terms, the quotient of 1-cells by the isomorphism relation provided
by 2-cells is in bijection with αβη-equivalence classes of STLC-terms (c.f. [55,
Chapter 5]).


Bicategorical relative full completeness. The bicategorical notion of relative 
full completeness arises by generalising from functors that are fully-faithful to 
pseudofunctors $F : B \to C$ that are locally an equivalence, that is, for which
every hom-functor $F_{X,Y} : B(X,Y) \to C(FX, FY)$ is an equivalence of categories.
Interpreted in the context of rewriting, this amounts to the conservativity of
rewriting theories. First, the equational theory of rewriting in C is conservative
over that in B: the hom-functors do not identify distinct rewrites. Second, the
reduction relation in C(FX, FY) is conservative over that in B(X, Y): whenever
$Ff \rightsquigarrow Fg$ in C then already $f \rightsquigarrow g$ in B. Third, the term structure in B gets
copied by F in C: modulo the equational theory of rewrites, there are no new 
terms between types in the image of F. 


Contributions. This paper makes two main contributions. 

Our first contribution, in Section 3, is to introduce the bicategorical glueing 
construction and, in Section 4, to initiate the development of its theory. As well 
as providing an assurance that our notion is the right one, this establishes the 
basic framework for applications. Importantly, we bicategorify the fundamental 
folklore result (e.g. [40,12,62]) establishing mild conditions under which a glued 
bicategory is cartesian closed. 

Our second contribution, in Section 5, is to employ bicategorical glueing to
show that for a bicategory B with finite-product completion $F^{\times}[B]$ and cartesian-closed
completion $F^{\times,\to}[B]$, the universal pseudofunctor $B \to F^{\times,\to}[B]$ and its
universal finite-product-preserving extension $F^{\times}[B] \to F^{\times,\to}[B]$ are both locally
an equivalence. Since one may directly observe that the universal pseudofunctor
$B \to F^{\times}[B]$ is locally an equivalence, we obtain relative full completeness
results for bicategorical cartesian closed structure mirroring those of the categorical
setting. Establishing this proof-theoretically would require the development
of a 2-dimensional proof theory. Given the complexities already present at the 
categorical level this seems a serious and interesting undertaking. Here, once the 
basic bicategorical theory has been established, the proof is relatively compact. 
This highlights the effectiveness of our approach for the application. 

The result may also be expressed type-theoretically. For instance, in terms of
the type theories of [20], the type theory $\Lambda_{\mathrm{ps}}^{\times,\to}$ for cartesian closed bicategories
is a conservative extension of the type theory $\Lambda_{\mathrm{ps}}^{\times}$ for finite-product bicategories.
It follows that, modulo the equational theory of bicategorical products and
exponentials, any rewrite between STPC-terms constructed using the βη-rewrites
for both products and exponentials may be equally presented as constructed
from just the βη-rewrites for products (see [21,55]).


Further work. We view the foundational theory presented here as the start- 
ing point for future work. For instance, we plan to incorporate further type 
structure into the development, such as coproducts (c.f. [22,16,4]) and monoidal 
structure (c.f. [31]). 

On the other hand, the importance of glueing in the categorical setting 
suggests that its bicategorical counterpart will find a range of applications. A 
case in point, which has already been developed, is the proof of a 2-dimensional 
normalisation property for the type theory $\Lambda_{\mathrm{ps}}^{\times,\to}$ for cartesian closed bicategories
of [20] that entails a corresponding bicategorical coherence theorem [21,55]. There
are also a variety of syntactic constructions in programming languages and type
theory that naturally come with a 2-dimensional semantics (see e.g. the use of 
2-categorical constructions in [23,14,6,61,35]). In such scenarios, bicategorical 
glueing may prove useful for establishing properties corresponding to the notions 
of adequacy and/or canonicity, or for proving further conservativity properties. 


2 Cartesian closed bicategories 


We begin by briefly recapitulating the basic theory of bicategories, including the 
definition of cartesian closure. A summary of the key definitions is in [41]; for a 
more extensive introduction see e.g. [5,7]. 


2.1 Bicategories 


Bicategories axiomatise structures in which the associativity and unit laws of 
composition only hold up to coherent isomorphism, for instance when composition 
is defined by a universal property. They are rife in mathematics and theoretical 
computer science, appearing in the semantics of computation [29,11,49], datatype 
models [1,13], categorical logic [26], and categorical algebra [19,25,18]. 


Definition 1 ([5]). A bicategory B consists of 


1. A class of objects ob(B), 

2. For every $X, Y \in ob(B)$ a hom-category $(B(X,Y), \bullet, \mathrm{id})$ with objects 1-cells
$f : X \to Y$ and morphisms 2-cells $\alpha : f \Rightarrow f' : X \to Y$; composition of 2-cells
is called vertical composition,

3. For every $X, Y, Z \in ob(B)$ an identity functor $\mathrm{Id}_X : 1 \to B(X,X)$ (for
1 the terminal category) and a horizontal composition functor
$\circ_{X,Y,Z} : B(Y,Z) \times B(X,Y) \to B(X,Z)$,

4. Invertible 2-cells

$$
\begin{aligned}
a_{h,g,f} &: (h \circ g) \circ f \Rightarrow h \circ (g \circ f) : W \to Z \\
l_f &: \mathrm{Id}_X \circ f \Rightarrow f : W \to X \\
r_g &: g \circ \mathrm{Id}_X \Rightarrow g : X \to Y
\end{aligned}
$$

for every $f : W \to X$, $g : X \to Y$ and $h : Y \to Z$, natural in each of their
parameters and satisfying a triangle law and a pentagon law analogous to
those for monoidal categories.
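
For orientation, the two laws can be written out by transcribing the usual monoidal-category axioms to the structural 2-cells of item 4. This is a sketch under assumptions rather than a quotation from [5]: it writes $\bullet$ for vertical composition, uses $g \circ \alpha$ and $\alpha \circ f$ for horizontal composition of a 2-cell with an identity 2-cell (whiskering), and introduces the extra object $V$ and 1-cell $k$ purely for illustration. The triangle law is stated for $f : W \to X$ and $g : X \to Y$, and the pentagon law for $f : V \to W$, $g : W \to X$, $h : X \to Y$ and $k : Y \to Z$.

```latex
\begin{aligned}
\text{(triangle)}\quad
  & (g \circ l_f) \bullet a_{g,\mathrm{Id}_X,f} \;=\; r_g \circ f
  && : (g \circ \mathrm{Id}_X) \circ f \Rightarrow g \circ f \\[4pt]
\text{(pentagon)}\quad
  & a_{k,h,g \circ f} \bullet a_{k \circ h,g,f}
    \;=\; (k \circ a_{h,g,f}) \bullet a_{k,h \circ g,f} \bullet (a_{k,h,g} \circ f)
  && : ((k \circ h) \circ g) \circ f \Rightarrow k \circ (h \circ (g \circ f))
\end{aligned}
```

Reading each side from right to left gives a path of rebracketings between the stated source and target, and the laws say that the two paths agree.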


A bicategory is said to be locally small if every hom-category is small. 


Example 1. 1. Every 2-category is a bicategory in which the structural isomor- 
phisms are all the identity. 

2. For any category C with pullbacks there exists a bicategory of spans over C [5].
The objects are those of C, 1-cells $A \rightsquigarrow B$ are spans $(A \leftarrow X \to B)$, and
2-cells $(A \leftarrow X \to B) \Rightarrow (A \leftarrow X' \to B)$ are morphisms $X \to X'$ making the
expected diagram commute. Composition is defined using chosen pullbacks.

A bicategory has three notions of ‘opposite’, depending on whether one
reverses 1-cells, 2-cells, or both (see e.g. [37, §1.6]). We shall only require the 
following. 


Definition 2. The opposite of a bicategory $\mathcal{B}$, denoted $\mathcal{B}^{op}$, is obtained by setting $\mathcal{B}^{op}(X,Y) := \mathcal{B}(Y,X)$ for all $X, Y \in \mathcal{B}$.


A morphism of bicategories is called a pseudofunctor (or homomorphism) [5]. 
It is a mapping on objects, 1-cells and 2-cells that preserves horizontal composition 
up to isomorphism. Vertical composition is preserved strictly. 


Definition 3. A pseudofunctor $(F, \phi, \psi) : \mathcal{B} \to \mathcal{C}$ between bicategories $\mathcal{B}$ and $\mathcal{C}$ consists of

1. A mapping $F : ob(\mathcal{B}) \to ob(\mathcal{C})$,

2. A functor $F_{X,Y} : \mathcal{B}(X,Y) \to \mathcal{C}(FX, FY)$ for every $X, Y \in ob(\mathcal{B})$,

3. An invertible 2-cell $\psi_X : \mathrm{Id}_{FX} \Rightarrow F(\mathrm{Id}_X)$ for every $X \in ob(\mathcal{B})$,

4. An invertible 2-cell $\phi_{f,g} : F(f) \circ F(g) \Rightarrow F(f \circ g)$ for every $g : X \to Y$ and $f : Y \to Z$, natural in $f$ and $g$,

subject to two unit laws and an associativity law. A pseudofunctor for which $\phi$ and $\psi$ are both the identity is called strict. A pseudofunctor is called locally P if every functor $F_{X,Y}$ satisfies the property P.


Example 2. A monoidal category is equivalently a one-object bicategory; a 
monoidal functor is equivalently a pseudofunctor between one-object bicate- 
gories. 


Pseudofunctors $F, G : \mathcal{B} \to \mathcal{C}$ are related by pseudonatural transformations. A pseudonatural transformation $(k, \bar{k}) : F \Rightarrow G$ consists of a family of 1-cells $(k_X : FX \to GX)_{X \in \mathcal{B}}$ and, for every $f : X \to Y$, an invertible 2-cell $\bar{k}_f : k_Y \circ Ff \Rightarrow Gf \circ k_X$ witnessing naturality. The 2-cells $\bar{k}_f$ are required to be natural in $f$ and satisfy two coherence axioms. A morphism of pseudonatural transformations is called a modification, and may be thought of as a coherent family of 2-cells.


Notation 1. For bicategories $\mathcal{B}$ and $\mathcal{C}$ we write $\mathrm{Bicat}(\mathcal{B}, \mathcal{C})$ for the (possibly large) bicategory of pseudofunctors, pseudonatural transformations, and modifications (see e.g. [41]). If $\mathcal{C}$ is a 2-category, then so is $\mathrm{Bicat}(\mathcal{B}, \mathcal{C})$. We write $\mathbf{Cat}$ for the 2-category of small categories and think of the 2-category $\mathrm{Bicat}(\mathcal{B}^{op}, \mathbf{Cat})$ as a bicategorical version of the presheaf category $\mathbf{Set}^{\mathbb{C}^{op}}$. As for presheaf categories, one must take care to avoid size issues. We therefore adopt the convention that when considering $\mathrm{Bicat}(\mathcal{B}^{op}, \mathbf{Cat})$ the bicategory $\mathcal{B}$ is small or locally small as appropriate.


Example 3. For every bicategory $\mathcal{B}$ and $X \in \mathcal{B}$ there exists the representable pseudofunctor $\mathrm{Y}X : \mathcal{B}^{op} \to \mathbf{Cat}$, defined by $\mathrm{Y}X := \mathcal{B}(-, X)$. The 2-cells $\phi$ and $\psi$ are structural isomorphisms.
The notion of equivalence between bicategories is called biequivalence. A biequivalence $\mathcal{B} \simeq \mathcal{C}$ consists of a pair of pseudofunctors $F : \mathcal{B} \rightleftarrows \mathcal{C} : G$ together with equivalences $FG \simeq \mathrm{id}_{\mathcal{C}}$ and $GF \simeq \mathrm{id}_{\mathcal{B}}$ in $\mathrm{Bicat}(\mathcal{C}, \mathcal{C})$ and $\mathrm{Bicat}(\mathcal{B}, \mathcal{B})$ respectively. Equivalences in an arbitrary bicategory are defined by analogy with equivalences of categories, see e.g. [42, pp. 28].


Remark 1. The coherence theorem for monoidal categories [44, Chapter VII] generalises to bicategories: any bicategory is biequivalent to a 2-category [45] (see [42] for a readable summary of the argument). We are therefore justified in writing simply $\cong$ for composites of $a$, $l$ and $r$.


As a rule of thumb, a category-theoretic proposition lifts to a bicategorical 
proposition so long as one takes care to weaken isomorphisms to equivalences 
and sprinkle the prefixes ‘pseudo’ and ‘bi’ in appropriate places. For instance, 
bicategorical adjoints are called biadjoints and bicategorical limits are called 
bilimits [59]. The latter may be thought of as limits in which every cone is filled by 
a coherent choice of invertible 2-cell. Bilimits are preserved by representable pseudofunctors and by right biadjoints. The bicategorical Yoneda lemma [59, §1.9] says that for any pseudofunctor $P : \mathcal{B}^{op} \to \mathbf{Cat}$, evaluation at the identity determines a pseudonatural family of equivalences $\mathrm{Bicat}(\mathcal{B}^{op}, \mathbf{Cat})(\mathrm{Y}X, P) \simeq PX$. One may then deduce that the Yoneda pseudofunctor $\mathrm{Y} : \mathcal{B} \to \mathrm{Bicat}(\mathcal{B}^{op}, \mathbf{Cat}) : X \mapsto \mathrm{Y}X$ is locally an equivalence. Another ‘bicategorified’ lemma is the following, which we shall employ in Section 5.


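Lemma 1. 1. For pseudofunctors $F, G : \mathcal{B} \to \mathcal{C}$, if $F \simeq G$ and $G$ is locally an equivalence, then so is $F$.
2. For pseudofunctors $F : \mathcal{A} \to \mathcal{B}$, $G : \mathcal{B} \to \mathcal{C}$, $H : \mathcal{C} \to \mathcal{D}$, if $G \circ F$ and $H \circ G$ are local equivalences, then so is $F$.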


2.2 fp-Bicategories 


It is convenient to directly consider all finite products, as this reduces the need 
to deal with the equivalent objects given by re-bracketing binary products. To 
avoid confusion with the ‘cartesian bicategories’ of Carboni and Walters [10,8], 
we call a bicategory with all finite products an fp-bicategory. 


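Definition 4. An fp-bicategory $(\mathcal{B}, \Pi_n(-))$ is a bicategory $\mathcal{B}$ equipped with the following data for every $A_1, \dots, A_n \in \mathcal{B}$ ($n \in \mathbb{N}$):

1. A chosen object $\Pi_n(A_1, \dots, A_n)$,
2. Chosen arrows $\pi_k : \Pi_n(A_1, \dots, A_n) \to A_k$ ($k = 1, \dots, n$), called projections,
3. For every $X \in \mathcal{B}$ an adjoint equivalence

\[
\mathcal{B}\big(X, \Pi_n(A_1, \dots, A_n)\big)
\;\underset{\langle -, \dots, = \rangle}{\overset{(\pi_1 \circ -, \dots, \pi_n \circ -)}{\rightleftarrows}}\;
\prod_{i=1}^{n} \mathcal{B}(X, A_i)
\qquad (1)
\]

specified by choosing a family of universal arrows (see e.g. [44, Theorem IV.2]) with components $\varpi^{(i)}_{f_1, \dots, f_n} : \pi_i \circ \langle f_1, \dots, f_n \rangle \Rightarrow f_i$ for $i = 1, \dots, n$.

We call the right adjoint $\langle -, \dots, = \rangle$ the n-ary tupling.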


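Explicitly, the universal property of $\varpi = (\varpi^{(1)}, \dots, \varpi^{(n)})$ is the following. For any finite family of 2-cells $(\alpha_i : \pi_i \circ g \Rightarrow f_i : X \to A_i)_{i=1,\dots,n}$, there exists a 2-cell $p^{\dagger}(\alpha_1, \dots, \alpha_n) : g \Rightarrow \langle f_1, \dots, f_n \rangle : X \to \Pi_n(A_1, \dots, A_n)$, unique such that

\[
\varpi^{(k)}_{f_1, \dots, f_n} \bullet \big(\pi_k \circ p^{\dagger}(\alpha_1, \dots, \alpha_n)\big) = \alpha_k : \pi_k \circ g \Rightarrow f_k
\]

for $k = 1, \dots, n$. One thereby obtains a functor $\langle -, \dots, = \rangle$ and an adjunction as in (1) with counit $\varpi = (\varpi^{(1)}, \dots, \varpi^{(n)})$ and unit $\varsigma_g := p^{\dagger}(\mathrm{id}_{\pi_1 \circ g}, \dots, \mathrm{id}_{\pi_n \circ g}) : g \Rightarrow \langle \pi_1 \circ g, \dots, \pi_n \circ g \rangle$. This defines a lax n-ary product structure: one merely obtains an adjunction in (1). One turns it into a bicategorical (pseudo) product by further requiring the unit and counit to be invertible. The terminal object 1 arises as $\Pi_0()$. We adopt the same notation as for categorical products, for example by writing $\prod_{i=1}^{n} A_i$ for $\Pi_n(A_1, \dots, A_n)$ and $\prod_i f_i$ for $\langle f_1 \circ \pi_1, \dots, f_n \circ \pi_n \rangle$.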


Example 4. The bicategory of spans over a lextensive category [9] has finite 
products; such a bicategory is biequivalent to its opposite, so these are in fact 
biproducts [38, Theorem 6.2]. Biproduct structure arises using the coproduct 
structure of the underlying category (c.f. the biproduct structure of the category 
of relations). 


Remark 2 (c.f. Remark 1). fp-Bicategories satisfy the following coherence theorem: every fp-bicategory is biequivalent to a 2-category with 2-categorical products [52, Theorem 4.1]. Thus, we shall sometimes simply write $\cong$ in diagrams for composites of 2-cells arising from either the bicategorical or product structure. In pasting diagrams we shall omit such 2-cells completely (c.f. [30, Remark 3.1.16]; for a detailed exposition, see [64, Appendix A]).


One may think of bicategorical product structure as an intensional version 
of the familiar categorical structure, except the usual equations (e.g. [28]) are 
now witnessed by natural families of invertible 2-cells. It is useful to introduce 
explicit names for these 2-cells. 


Notation 2. In the following, and throughout, we write $A_\bullet$ for a finite sequence $\langle A_1, \dots, A_n \rangle$.


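Lemma 2. For any fp-bicategory $(\mathcal{B}, \Pi_n(-))$ there exist canonical choices for the following natural families of invertible 2-cells:

1. For every $(h_i : Y \to A_i)_{i=1,\dots,n}$ and $g : X \to Y$, a 2-cell $\mathrm{post}(h_\bullet; g) : \langle h_1, \dots, h_n \rangle \circ g \Rightarrow \langle h_1 \circ g, \dots, h_n \circ g \rangle$,

2. For every $(h_i : A_i \to B_i)_{i=1,\dots,n}$ and $(g_i : X \to A_i)_{i=1,\dots,n}$, a 2-cell $\mathrm{fuse}(h_\bullet; g_\bullet) : (\prod_{i=1}^{n} h_i) \circ \langle g_1, \dots, g_n \rangle \Rightarrow \langle h_1 \circ g_1, \dots, h_n \circ g_n \rangle$.

In particular, it follows from Lemma 2(2) that there exists a canonical natural family of invertible 2-cells $\Phi_{h_\bullet, g_\bullet} : (\prod_{i=1}^{n} h_i) \circ (\prod_{i=1}^{n} g_i) \Rightarrow \prod_{i=1}^{n} (h_i \circ g_i)$ for any $(h_i : A_i \to B_i)_{i=1,\dots,n}$ and $(g_j : X_j \to A_j)_{j=1,\dots,n}$.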

In the categorical setting, a cartesian functor preserves products up to isomor- 
phism. An fp-pseudofunctor preserves bicategorical products up to equivalence. 


284 M. Fiore and P. Saville 

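Definition 5. An fp-pseudofunctor $(F, q^{\times})$ between fp-bicategories $(\mathcal{B}, \Pi_n(-))$ and $(\mathcal{C}, \Pi_n(-))$ is a pseudofunctor $F : \mathcal{B} \to \mathcal{C}$ equipped with specified equivalences

\[
\langle F\pi_1, \dots, F\pi_n \rangle : F\big(\textstyle\prod_{i=1}^{n} A_i\big) \rightleftarrows \textstyle\prod_{i=1}^{n} (FA_i) : q^{\times}_{A_\bullet}
\]

for every $A_1, \dots, A_n \in \mathcal{B}$ ($n \in \mathbb{N}$). We denote the 2-cells witnessing these equivalences by $u^{\times}_{A_\bullet} : \mathrm{Id}_{(\prod_i FA_i)} \Rightarrow \langle F\pi_1, \dots, F\pi_n \rangle \circ q^{\times}_{A_\bullet}$ and $c^{\times}_{A_\bullet} : q^{\times}_{A_\bullet} \circ \langle F\pi_1, \dots, F\pi_n \rangle \Rightarrow \mathrm{Id}_{(F \prod_i A_i)}$. We call $(F, q^{\times})$ strict if $F$ is strict and satisfies

\[
\begin{aligned}
F\big(\Pi_n(A_1, \dots, A_n)\big) &= \Pi_n(FA_1, \dots, FA_n) \\
F\big(\pi_i^{A_1, \dots, A_n}\big) &= \pi_i^{FA_1, \dots, FA_n} &
F\varpi^{(i)}_{t_1, \dots, t_n} &= \varpi^{(i)}_{Ft_1, \dots, Ft_n} \\
F\langle t_1, \dots, t_n \rangle &= \langle Ft_1, \dots, Ft_n \rangle &
q^{\times}_{A_1, \dots, A_n} &= \mathrm{Id}_{\Pi_n(FA_1, \dots, FA_n)}
\end{aligned}
\]

with equivalences given by the 2-cells $p^{\dagger}(r_{\pi_1}, \dots, r_{\pi_n}) : \mathrm{Id} \Rightarrow \langle \pi_1, \dots, \pi_n \rangle$.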


Notation 3. For fp-bicategories $\mathcal{B}$ and $\mathcal{C}$ we write $\text{fp-Bicat}(\mathcal{B}, \mathcal{C})$ for the bicategory of fp-pseudofunctors, pseudonatural transformations and modifications.³


We define two further families of 2-cells to witness standard properties of cartesian functors. The first witnesses the fact that any fp-pseudofunctor commutes with the $\prod_n(-, \dots, =)$ operation. The second witnesses the equality $\langle F\pi_1, \dots, F\pi_n \rangle \circ F\langle f_1, \dots, f_n \rangle = \langle Ff_1, \dots, Ff_n \rangle$ ‘unpacking’ an n-ary tupling from inside $F$.


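Lemma 3. Let $(F, q^{\times}) : (\mathcal{B}, \Pi_n(-)) \to (\mathcal{C}, \Pi_n(-))$ be an fp-pseudofunctor.

1. For any finite family of 1-cells $(f_i : A_i \to A'_i)_{i=1,\dots,n}$ in $\mathcal{B}$, there exists an invertible 2-cell $\mathrm{nat}_{f_\bullet} : q^{\times}_{A'_\bullet} \circ \prod_{i=1}^{n} Ff_i \Rightarrow F(\prod_{i=1}^{n} f_i) \circ q^{\times}_{A_\bullet}$ such that the pair $(q^{\times}, \mathrm{nat})$ forms a pseudonatural transformation

\[
\textstyle\prod_{i=1}^{n} \big(F(-), \dots, F(=)\big) \Rightarrow \big(F \circ \prod_{i=1}^{n}\big)(-, \dots, =)
\]

2. For any finite family of 1-cells $(f_i : X \to B_i)_{i=1,\dots,n}$ in $\mathcal{B}$, there exists a canonical choice of naturally invertible 2-cell $\mathrm{unpack}_{f_\bullet} : \langle F\pi_1, \dots, F\pi_n \rangle \circ F\langle f_1, \dots, f_n \rangle \Rightarrow \langle Ff_1, \dots, Ff_n \rangle : FX \to \prod_{i=1}^{n} FB_i$.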


2.3 Cartesian closed bicategories 


A cartesian closed bicategory is an fp-bicategory $(\mathcal{B}, \Pi_n(-))$ equipped with a biadjunction $(-) \times A \dashv (A \Rightarrow -)$ for every $A \in \mathcal{B}$. Examples include the bicategory of generalised species [17], bicategories of concurrent games [49], and bicategories of operads [26].


3 In the categorical setting, every natural transformation between cartesian functors 
is monoidal with respect to the cartesian structure and a similar fact is true bicat- 
egorically: every pseudonatural transformation is canonically compatible with the 
product structure, see [55, § 4.1.1]. 
Definition 6. A cartesian closed bicategory or cc-bicategory is an fp-bicategory $(\mathcal{B}, \Pi_n(-))$ equipped with the following data for every $A, B \in \mathcal{B}$:

1. A chosen object $(A \Rightarrow B)$,
2. A specified 1-cell $\mathrm{eval}_{A,B} : (A \Rightarrow B) \times A \to B$,
3. For every $X \in \mathcal{B}$, an adjoint equivalence

\[
\mathcal{B}(X, A \Rightarrow B)
\;\underset{\lambda}{\overset{\mathrm{eval}_{A,B} \circ (- \times A)}{\rightleftarrows}}\;
\mathcal{B}(X \times A, B)
\]

specified by a choice of universal arrow $\varepsilon_f : \mathrm{eval}_{A,B} \circ (\lambda f \times A) \Rightarrow f$.

We call the functor $\lambda(-)$ currying and refer to $\lambda f$ as the currying of $f$.


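Explicitly, the counit $\varepsilon$ satisfies the following universal property. For every 1-cell $g : X \to (A \Rightarrow B)$ and 2-cell $\alpha : \mathrm{eval}_{A,B} \circ (g \times A) \Rightarrow f$ there exists a unique 2-cell $e^{\dagger}(\alpha) : g \Rightarrow \lambda f$ such that $\varepsilon_f \bullet (\mathrm{eval}_{A,B} \circ (e^{\dagger}(\alpha) \times A)) = \alpha$. This defines a lax exponential structure. One obtains a pseudo (bicategorical) exponential structure by further requiring that $\varepsilon$ and the unit $\eta_t := e^{\dagger}(\mathrm{id}_{\mathrm{eval}_{A,B} \circ (t \times A)})$ are invertible.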
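
For intuition, the strict 1-categorical shadow of this universal property can be checked directly on finite sets, where the counit is an identity rather than merely an invertible 2-cell. The sketch below is only an illustration in the cartesian closed category of sets; the names `curry`, `eval_`, `f`, `X` and `A` are hypothetical and do not come from the paper.

```python
def curry(f):
    """Currying: send f : X x A -> B to (lambda f) : X -> (A => B)."""
    return lambda x: (lambda a: f(x, a))


def eval_(g, a):
    """Evaluation eval : (A => B) x A -> B."""
    return g(a)


def f(x, a):
    """An arbitrary sample map X x A -> B."""
    return x * 10 + a


X = range(3)
A = range(4)

# In Set the counit eval o (lambda f x A) => f is an equality of functions.
counit_is_identity = all(eval_(curry(f)(x), a) == f(x, a) for x in X for a in A)
```

In a cc-bicategory the equality tested here is replaced by the invertible 2-cell $\varepsilon_f$, and uniqueness of the mediating map is replaced by uniqueness of the 2-cell $e^{\dagger}(\alpha)$.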


Example 5. Every ‘presheaf’ 2-category $\mathrm{Bicat}(\mathcal{B}^{op}, \mathbf{Cat})$ has all bicategorical limits [52, Proposition 3.6], given pointwise, and is cartesian closed with $(P \Rightarrow Q)X := \mathrm{Bicat}(\mathcal{B}^{op}, \mathbf{Cat})(\mathrm{Y}X \times P, Q)$ [55, Chapter 6].


As for products, we adopt the notational conventions that are standard in the categorical setting, for example by writing $(f \Rightarrow g) : (A \Rightarrow B) \to (A' \Rightarrow B')$ for the currying of $(g \circ \mathrm{eval}_{A,B}) \circ (\mathrm{Id}_{A \Rightarrow B} \times f)$.

Just as fp-pseudofunctors preserve products up to equivalence, cartesian 
closed pseudofunctors preserve products and exponentials up to equivalence. 


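Definition 7. A cartesian closed pseudofunctor or cc-pseudofunctor between cc-bicategories $(\mathcal{B}, \Pi_n(-), \Rightarrow)$ and $(\mathcal{C}, \Pi_n(-), \Rightarrow)$ is an fp-pseudofunctor $(F, q^{\times})$ equipped with specified equivalences $m_{A,B} : F(A \Rightarrow B) \rightleftarrows (FA \Rightarrow FB) : q^{\Rightarrow}_{A,B}$ for every $A, B \in \mathcal{B}$, where $m_{A,B} : F(A \Rightarrow B) \to (FA \Rightarrow FB)$ is the currying of $F(\mathrm{eval}_{A,B}) \circ q^{\times}_{A \Rightarrow B, A}$. A cc-pseudofunctor $(F, q^{\times}, q^{\Rightarrow})$ is strict if $(F, q^{\times})$ is a strict fp-pseudofunctor such that

\[
\begin{aligned}
F(A \Rightarrow B) &= (FA \Rightarrow FB) \\
F(\mathrm{eval}_{A,B}) &= \mathrm{eval}_{FA,FB} & F(\varepsilon_t) &= \varepsilon_{Ft} \\
F(\lambda t) &= \lambda(Ft) & q^{\Rightarrow}_{A,B} &= \mathrm{Id}_{FA \Rightarrow FB}
\end{aligned}
\]

with equivalences given by the 2-cells

\[
e^{\dagger}(\mathrm{eval}_{FA,FB} \circ \kappa) : \mathrm{Id}_{(FA \Rightarrow FB)} \xRightarrow{\;\cong\;} \lambda\big(\mathrm{eval}_{FA,FB} \circ \mathrm{Id}_{(FA \Rightarrow FB) \times FA}\big)
\]

where $\kappa$ is the canonical isomorphism $\mathrm{Id}_{FA \Rightarrow FB} \times FA \cong \mathrm{Id}_{(FA \Rightarrow FB) \times FA}$.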
Remark 3. As is well-known in the case of $\mathbf{Cat}$ (e.g. [44, IV.2]), every equivalence $X \simeq Y$ in a bicategory gives rise to an adjoint equivalence between $X$ and $Y$ with the same 1-cells (see e.g. [42, pp. 28-29]). Thus, one may assume without loss of generality that all the equivalences in the preceding definition are adjoint equivalences. The same observation applies to the definition of fp-pseudofunctors.


Notation 4. For cc-bicategories $\mathcal{B}$ and $\mathcal{C}$ we write $\text{cc-Bicat}(\mathcal{B}, \mathcal{C})$ for the bicategory of cc-pseudofunctors, pseudonatural transformations and modifications (c.f. Notation 3).


3 Bicategorical glueing 


The glueing construction has been discovered in various forms, with correspond- 
ingly various names: the notions of logical relation [50,57], sconing [24], Freyd 
covers, and glueing (e.g. [40]) are all closely related (see e.g. [47] for an overview 
of the connections). Originally presented set-theoretically, the technique was 
quickly given categorical expression [43,47] and is now a standard component of 
the armoury for studying type theories (e.g. [40,12]). 

The glueing $\mathrm{gl}(F)$ of categories $\mathbb{C}$ and $\mathbb{D}$ along a functor $F : \mathbb{C} \to \mathbb{D}$ may be defined as the comma category $(\mathrm{id}_{\mathbb{D}} \downarrow F)$. We define bicategorical glueing analogously.


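Definition 8.

1. Let $F : \mathcal{A} \to \mathcal{C}$ and $G : \mathcal{B} \to \mathcal{C}$ be pseudofunctors of bicategories. The comma bicategory $(F \downarrow G)$ has objects triples $(A \in \mathcal{A}, f : FA \to GB, B \in \mathcal{B})$. The 1-cells $(A, f, B) \to (A', f', B')$ are triples $(p, \alpha, q)$, where $p : A \to A'$ and $q : B \to B'$ are 1-cells and $\alpha$ is an invertible 2-cell $\alpha : f' \circ Fp \Rightarrow Gq \circ f$. The 2-cells $(p, \alpha, q) \Rightarrow (p', \alpha', q')$ are pairs of 2-cells $(\sigma : p \Rightarrow p', \tau : q \Rightarrow q')$ such that the following diagram commutes:

\[
\begin{array}{ccc}
f' \circ F(p) & \xrightarrow{\;f' \circ F(\sigma)\;} & f' \circ F(p') \\
{\scriptstyle \alpha}\big\downarrow & & \big\downarrow{\scriptstyle \alpha'} \\
G(q) \circ f & \xrightarrow{\;G(\tau) \circ f\;} & G(q') \circ f
\end{array}
\qquad (2)
\]

Identities and horizontal composition are given by the following pasting diagrams.

[Pasting diagrams: the identity 1-cell on an object, and the horizontal composite of two 1-cells $(A, f, B) \to (A', f', B') \to (A'', f'', B'')$ with top edge $F(r \circ p)$.]

Vertical composition, the identity 2-cell, and the structural isomorphisms are given component-wise.

2. The glueing bicategory $\mathrm{gl}(\mathfrak{J})$ of bicategories $\mathcal{B}$ and $\mathcal{C}$ along a pseudofunctor $\mathfrak{J} : \mathcal{B} \to \mathcal{C}$ is the comma bicategory $(\mathrm{id}_{\mathcal{C}} \downarrow \mathfrak{J})$.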


We call axiom (2) the cylinder condition due to its shape when viewed as a (3-dimensional) pasting diagram. Note that one directly obtains projection pseudofunctors $\mathcal{B} \xleftarrow{\;\pi_{dom}\;} \mathrm{gl}(\mathfrak{J}) \xrightarrow{\;\pi_{cod}\;} \mathcal{C}$.

We develop some basic theory of glueing bicategories, which we shall put to use in Section 5. We follow the terminology of [15].


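Definition 9. Let $\mathfrak{J} : \mathcal{B} \to \mathcal{X}$ be a pseudofunctor. The relative hom-pseudofunctor $\langle \mathfrak{J} \rangle : \mathcal{X} \to \mathrm{Bicat}(\mathcal{B}^{op}, \mathbf{Cat})$ is defined by $\langle \mathfrak{J} \rangle X := \mathcal{X}(\mathfrak{J}(-), X)$.


Following [15], one might call the glueing bicategory $\mathrm{gl}(\langle \mathfrak{J} \rangle)$ associated to a relative hom-pseudofunctor the bicategory of $\mathcal{B}$-intensional Kripke relations of arity $\mathfrak{J}$, and view it as an intensional, bicategorical, version of the category of Kripke relations.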

The relative hom-pseudofunctor preserves all bilimits that exist in its domain. 
For products, this may be described explicitly. 


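Lemma 4. For any fp-bicategory $(\mathcal{X}, \Pi_n(-))$ and pseudofunctor $\mathfrak{J} : \mathcal{B} \to \mathcal{X}$, the relative hom-pseudofunctor $\langle \mathfrak{J} \rangle$ extends canonically to an fp-pseudofunctor.


Proof. Take $q^{\times}_{X_\bullet}$ to be the n-ary tupling $\prod_{i=1}^{n} \mathcal{X}(\mathfrak{J}(-), X_i) \xrightarrow{\;\simeq\;} \mathcal{X}(\mathfrak{J}(-), \prod_{i=1}^{n} X_i)$. This forms a pseudonatural transformation with naturality witnessed by post.


For any pseudofunctor $\mathfrak{J} : \mathcal{B} \to \mathcal{X}$ there exists a pseudonatural transformation $(l, \bar{l}) : \mathrm{Y} \Rightarrow \langle \mathfrak{J} \rangle \circ \mathfrak{J} : \mathcal{B} \to \mathrm{Bicat}(\mathcal{B}^{op}, \mathbf{Cat})$ given by the functorial action of $\mathfrak{J}$ on hom-categories. One may therefore define the following.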


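Definition 10. For any pseudofunctor $\mathfrak{J} : \mathcal{B} \to \mathcal{X}$, define the extended Yoneda pseudofunctor $\underline{\mathrm{Y}} : \mathcal{B} \to \mathrm{gl}(\langle \mathfrak{J} \rangle)$ by setting $\underline{\mathrm{Y}}B := (\mathrm{Y}B, (l, \bar{l})_{(-,B)}, \mathfrak{J}B)$, $\underline{\mathrm{Y}}f := (\mathrm{Y}f, (\phi^{\mathfrak{J}}_{f,-})^{-1}, \mathfrak{J}f)$ and $\underline{\mathrm{Y}}(\tau : f \Rightarrow f' : B \to B') := (\mathrm{Y}\tau, \mathfrak{J}\tau)$. The cylinder condition holds by the naturality of $\phi^{\mathfrak{J}}$, and the 2-cells $\phi^{\underline{\mathrm{Y}}}$ and $\psi^{\underline{\mathrm{Y}}}$ are $(\phi^{\mathrm{Y}}, \phi^{\mathfrak{J}})$ and $(\psi^{\mathrm{Y}}, \psi^{\mathfrak{J}})$, respectively.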


The extended Yoneda pseudofunctor satisfies a corresponding ‘extended 
Yoneda lemma’ (c.f. [15, pp. 33]). 


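Lemma 5. For any pseudofunctor $\mathfrak{J} : \mathcal{B} \to \mathcal{X}$ and $\underline{P} = (P, (k, \bar{k}), X) \in \mathrm{gl}(\langle \mathfrak{J} \rangle)$ there exists an equivalence of pseudofunctors $\mathrm{gl}(\langle \mathfrak{J} \rangle)(\underline{\mathrm{Y}}(-), \underline{P}) \simeq P$, where $\underline{\mathrm{Y}}$ is the extended Yoneda pseudofunctor, and an invertible modification as in the diagram below. Hence $\underline{\mathrm{Y}}$ is locally an equivalence.

[Diagram: a triangle containing the equivalence $\mathrm{gl}(\langle \mathfrak{J} \rangle)(\underline{\mathrm{Y}}(-), \underline{P}) \xrightarrow{\;\simeq\;} P$, filled by the invertible modification.]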
Proof. The arrow marked $\simeq$ is the composite of a projection and the equivalence arising from the Yoneda lemma. Its pseudo-inverse is the composite

\[
P \simeq \mathrm{Bicat}(\mathcal{B}^{op}, \mathbf{Cat})(\mathrm{Y}(-), P) \to \mathrm{gl}(\langle \mathfrak{J} \rangle)(\underline{\mathrm{Y}}(-), \underline{P}) \qquad (3)
\]

in which the equivalence arises from the Yoneda lemma and the unlabelled pseudofunctor takes a pseudonatural transformation $(j, \bar{j}) : \mathrm{Y}B \Rightarrow P$ to the triple with first component $(j, \bar{j})$, third component $k_B(j_B(\mathrm{Id}_B)) : \mathfrak{J}B \to X$ and second component defined using $\bar{k}$ and $\bar{j}$. Chasing the definitions through and evaluating at $A, B \in \mathcal{B}$, one sees that when $\underline{P} := \underline{\mathrm{Y}}B$ the composite (3) is equivalent to $\underline{\mathrm{Y}}_{A,B}$. Since (3) is locally an equivalence, Lemma 1(1) completes the proof.


4 Cartesian closed structure on the glueing bicategory 


It is well-known that, if $\mathbb{C}$ and $\mathbb{D}$ are cartesian closed categories, $\mathbb{D}$ has pullbacks, and $F : \mathbb{C} \to \mathbb{D}$ is cartesian, then $\mathrm{gl}(F)$ is cartesian closed (e.g. [40,12]). In this section we prove a corresponding result for the glueing bicategory. We shall be guided by the categorical proof, for which see e.g. [43, Proposition 2].


4.1 Finite products in gl(J) 


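Proposition 1. Let $(\mathcal{B}, \Pi_n(-))$ and $(\mathcal{C}, \Pi_n(-))$ be fp-bicategories and $(\mathfrak{J}, q^{\times}) : \mathcal{B} \to \mathcal{C}$ be an fp-pseudofunctor. Then $\mathrm{gl}(\mathfrak{J})$ is an fp-bicategory with both projection pseudofunctors $\pi_{dom}$ and $\pi_{cod}$ strictly preserving products.


For a family of objects $(C_i, c_i, B_i)_{i=1,\dots,n}$, the n-ary product $\prod_{i=1}^{n} (C_i, c_i, B_i)$ is defined to be the tuple $(\prod_{i=1}^{n} C_i, \; q^{\times}_{B_\bullet} \circ \prod_i c_i, \; \prod_{i=1}^{n} B_i)$. The kth projection $\underline{\pi}_k$ is $(\pi_k, \mu_k, \pi_k)$, where $\mu_k$ is defined by commutativity of the following diagram, shown here as a chain from the source of $\mu_k$ to its target:

\[
\begin{aligned}
c_k \circ \pi_k
&\Rightarrow \pi_k \circ \textstyle\prod_i c_i
\cong (\pi_k \circ \mathrm{Id}_{\prod_i \mathfrak{J}B_i}) \circ \textstyle\prod_i c_i \\
&\Rightarrow \big(\pi_k \circ (\langle \mathfrak{J}\pi_1, \dots, \mathfrak{J}\pi_n \rangle \circ q^{\times}_{B_\bullet})\big) \circ \textstyle\prod_i c_i
\cong \big((\pi_k \circ \langle \mathfrak{J}\pi_1, \dots, \mathfrak{J}\pi_n \rangle) \circ q^{\times}_{B_\bullet}\big) \circ \textstyle\prod_i c_i \\
&\Rightarrow (\mathfrak{J}\pi_k \circ q^{\times}_{B_\bullet}) \circ \textstyle\prod_i c_i
\cong \mathfrak{J}(\pi_k) \circ (q^{\times}_{B_\bullet} \circ \textstyle\prod_i c_i)
\end{aligned}
\]

Here the three arrows marked $\Rightarrow$ are induced, in order, by the inverse of $\varpi^{(k)}$, by $u^{\times}_{B_\bullet}$, and by $\varpi^{(k)}$.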


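For an n-ary family of 1-cells $(g_i, \alpha_i, f_i) : (Y, y, X) \to (C_i, c_i, B_i)$ ($i = 1, \dots, n$), the n-ary tupling is $(\langle g_1, \dots, g_n \rangle, \{\alpha_1, \dots, \alpha_n\}, \langle f_1, \dots, f_n \rangle)$, where $\{\alpha_1, \dots, \alpha_n\}$ is the composite

\[
\begin{aligned}
(q^{\times}_{B_\bullet} \circ \textstyle\prod_i c_i) \circ \langle g_1, \dots, g_n \rangle
&\cong q^{\times}_{B_\bullet} \circ (\textstyle\prod_i c_i \circ \langle g_1, \dots, g_n \rangle) \\
&\xRightarrow{\;q^{\times}_{B_\bullet} \circ \mathrm{fuse}\;} q^{\times}_{B_\bullet} \circ \langle c_1 \circ g_1, \dots, c_n \circ g_n \rangle \\
&\xRightarrow{\;q^{\times}_{B_\bullet} \circ \langle \alpha_1, \dots, \alpha_n \rangle\;} q^{\times}_{B_\bullet} \circ \langle \mathfrak{J}f_1 \circ y, \dots, \mathfrak{J}f_n \circ y \rangle \\
&\xRightarrow{\;q^{\times}_{B_\bullet} \circ \mathrm{post}^{-1}\;} q^{\times}_{B_\bullet} \circ (\langle \mathfrak{J}f_1, \dots, \mathfrak{J}f_n \rangle \circ y) \\
&\xRightarrow{\;q^{\times}_{B_\bullet} \circ (\mathrm{unpack}^{-1} \circ y)\;} q^{\times}_{B_\bullet} \circ \big((\langle \mathfrak{J}\pi_1, \dots, \mathfrak{J}\pi_n \rangle \circ \mathfrak{J}\langle f_1, \dots, f_n \rangle) \circ y\big) \\
&\cong (q^{\times}_{B_\bullet} \circ \langle \mathfrak{J}\pi_1, \dots, \mathfrak{J}\pi_n \rangle) \circ (\mathfrak{J}\langle f_1, \dots, f_n \rangle \circ y) \\
&\xRightarrow{\;c^{\times}_{B_\bullet} \circ (\mathfrak{J}\langle f_1, \dots, f_n \rangle \circ y)\;} \mathrm{Id}_{\mathfrak{J}(\prod_i B_i)} \circ (\mathfrak{J}\langle f_1, \dots, f_n \rangle \circ y) \\
&\cong \mathfrak{J}\langle f_1, \dots, f_n \rangle \circ y
\end{aligned}
\]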

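Finally, for every family of 1-cells $(g_i, \alpha_i, f_i) : (Y, y, X) \to (C_i, c_i, B_i)$ ($i = 1, \dots, n$) we require a glued 2-cell $\underline{\pi}_k \circ (\langle g_1, \dots, g_n \rangle, \{\alpha_1, \dots, \alpha_n\}, \langle f_1, \dots, f_n \rangle) \Rightarrow (g_k, \alpha_k, f_k)$ to act as the counit. We take simply $(\varpi^{(k)}_{g_\bullet}, \varpi^{(k)}_{f_\bullet})$. This pair forms a 2-cell in $\mathrm{gl}(\mathfrak{J})$, and the required universal property holds pointwise.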


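Remark 4. If $(\mathfrak{J}, q^{\times}) : \mathcal{B} \to \mathcal{X}$ is an fp-pseudofunctor, then the extended Yoneda pseudofunctor $\underline{\mathrm{Y}} : \mathcal{B} \to \mathrm{gl}(\langle \mathfrak{J} \rangle)$ canonically extends to an fp-pseudofunctor. The pseudoinverse to $\langle \underline{\mathrm{Y}}\pi_1, \dots, \underline{\mathrm{Y}}\pi_n \rangle$ is $(\langle -, \dots, = \rangle, \cong, q^{\times})$, where the component of the isomorphism at $(f_i : X \to B_i)_{i=1,\dots,n}$ is

\[
\mathfrak{J}\langle f_\bullet \rangle
\cong \mathrm{Id}_{\mathfrak{J}(\prod_i B_i)} \circ \mathfrak{J}\langle f_\bullet \rangle
\xRightarrow{\;(c^{\times}_{B_\bullet})^{-1} \circ \mathfrak{J}\langle f_\bullet \rangle\;}
(q^{\times}_{B_\bullet} \circ \langle \mathfrak{J}\pi_\bullet \rangle) \circ \mathfrak{J}\langle f_\bullet \rangle
\cong q^{\times}_{B_\bullet} \circ (\langle \mathfrak{J}\pi_\bullet \rangle \circ \mathfrak{J}\langle f_\bullet \rangle)
\xRightarrow{\;q^{\times}_{B_\bullet} \circ \mathrm{unpack}\;}
q^{\times}_{B_\bullet} \circ \langle \mathfrak{J}f_\bullet \rangle .
\]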


4.2 Exponentials in gl(J) 


As in the 1-categorical case, the definition of currying in $\mathrm{gl}(\mathfrak{J})$ employs pullbacks. A pullback of the cospan $(X_1 \to X_0 \leftarrow X_2)$ in a bicategory $\mathcal{B}$ is a bilimit for the strict pseudofunctor $X : (1 \to 0 \leftarrow 2) \to \mathcal{B}$ determined by the cospan. We state the universal property in the form that will be most useful for our applications.


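Lemma 6. The pullback of a cospan $(X_1 \xrightarrow{f_1} X_0 \xleftarrow{f_2} X_2)$ in a bicategory $\mathcal{B}$ is determined, up to equivalence, by the following data and properties: a span $(X_1 \xleftarrow{\gamma_1} P \xrightarrow{\gamma_2} X_2)$ in $\mathcal{B}$ and an invertible 2-cell $\bar{\gamma} : f_2 \circ \gamma_2 \Rightarrow f_1 \circ \gamma_1$ filling the diagram on the left below

[Diagrams: on the left, the square with apex $P$, legs $\gamma_1, \gamma_2$ and base $f_1, f_2$, filled by $\bar{\gamma}$; on the right, the square with apex $Q$, legs $\mu_1, \mu_2$ and the same base, filled by an invertible 2-cell $\bar{\mu} : f_2 \circ \mu_2 \Rightarrow f_1 \circ \mu_1$.]

such that

1. for any other diagram as on the right above there exists a fill-in $(u, \Xi_1, \Xi_2)$, namely a 1-cell $u : Q \to P$ and invertible 2-cells $\Xi_i : \gamma_i \circ u \Rightarrow \mu_i$ ($i = 1, 2$) satisfying

\[
\begin{array}{ccccc}
(f_2 \circ \gamma_2) \circ u & \xrightarrow{\;\cong\;} & f_2 \circ (\gamma_2 \circ u) & \xrightarrow{\;f_2 \circ \Xi_2\;} & f_2 \circ \mu_2 \\
{\scriptstyle \bar{\gamma} \circ u}\big\downarrow & & & & \big\downarrow{\scriptstyle \bar{\mu}} \\
(f_1 \circ \gamma_1) \circ u & \xrightarrow{\;\cong\;} & f_1 \circ (\gamma_1 \circ u) & \xrightarrow{\;f_1 \circ \Xi_1\;} & f_1 \circ \mu_1
\end{array}
\]

2. for any 1-cells $v, w : Q \to P$ and 2-cells $\Psi_i : \gamma_i \circ v \Rightarrow \gamma_i \circ w$ ($i = 1, 2$) satisfying

\[
\begin{array}{ccccccc}
(f_2 \circ \gamma_2) \circ v & \xrightarrow{\;\cong\;} & f_2 \circ (\gamma_2 \circ v) & \xrightarrow{\;f_2 \circ \Psi_2\;} & f_2 \circ (\gamma_2 \circ w) & \xrightarrow{\;\cong\;} & (f_2 \circ \gamma_2) \circ w \\
{\scriptstyle \bar{\gamma} \circ v}\big\downarrow & & & & & & \big\downarrow{\scriptstyle \bar{\gamma} \circ w} \\
(f_1 \circ \gamma_1) \circ v & \xrightarrow{\;\cong\;} & f_1 \circ (\gamma_1 \circ v) & \xrightarrow{\;f_1 \circ \Psi_1\;} & f_1 \circ (\gamma_1 \circ w) & \xrightarrow{\;\cong\;} & (f_1 \circ \gamma_1) \circ w
\end{array}
\]

there exists a unique 2-cell $\Psi^{\dagger} : v \Rightarrow w$ such that $\Psi_i = \gamma_i \circ \Psi^{\dagger}$ ($i = 1, 2$).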


Example 6. 1. In $\mathbf{Cat}$, the pullback of a cospan $(\mathbb{B} \xrightarrow{F} \mathbb{X} \xleftarrow{G} \mathbb{C})$ is the full subcategory of the comma category $(F \downarrow G)$ consisting of objects of the form $(B, f, C)$ for which $f : FB \to GC$ is an isomorphism. Note that this differs from the strict (2-)categorical pullback in $\mathbf{Cat}$, in which every $f$ is required to be an identity (c.f. [65, Example 2.1]).

2. Like any bilimit, pullbacks in the bicategory $\mathrm{Bicat}(\mathcal{B}^{op}, \mathbf{Cat})$ are computed pointwise (see [53, Proposition 3.6]).


We now define exponentials in the glueing bicategory. Precisely, we extend 
Proposition 1 to the following. 


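Theorem 5. Let $(\mathcal{B}, \Pi_n(-), \Rightarrow)$ and $(\mathcal{C}, \Pi_n(-), \Rightarrow)$ be cc-bicategories such that $\mathcal{C}$ has pullbacks. For any fp-pseudofunctor $(\mathfrak{J}, q^{\times}) : (\mathcal{B}, \Pi_n(-)) \to (\mathcal{C}, \Pi_n(-))$, the glueing bicategory $\mathrm{gl}(\mathfrak{J})$ has a cartesian closed structure with forgetful pseudofunctor $\pi_{dom} : \mathrm{gl}(\mathfrak{J}) \to \mathcal{B}$ strictly preserving products and exponentials.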


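The evaluation map. We begin by defining the mapping $(-) \Rightarrow (=)$ and the evaluation 1-cell eval. For $\underline{C} := (C, c, B)$, $\underline{C}' := (C', c', B') \in \mathrm{gl}(\mathfrak{J})$ we set $\underline{C} \Rightarrow \underline{C}'$ to be the left-hand vertical leg $p_{c,c'}$ of the following pullback diagram, in which we write $m_{B,B'} := \lambda(\mathfrak{J}(\mathrm{eval}_{B,B'}) \circ q^{\times}_{B \Rightarrow B', B})$:

\[
\begin{array}{ccccc}
(C \supset C') & & \xrightarrow{\qquad q_{c,c'} \qquad} & & (C \Rightarrow C') \\
{\scriptstyle p_{c,c'}}\big\downarrow & & \omega_{c,c'} \;\cong & & \big\downarrow{\scriptstyle \lambda(c' \circ \mathrm{eval}_{C,C'})} \\
\mathfrak{J}(B \Rightarrow B') & \xrightarrow{\;m_{B,B'}\;} & (\mathfrak{J}B \Rightarrow \mathfrak{J}B') & \xrightarrow{\;\lambda(\mathrm{eval}_{\mathfrak{J}B,\mathfrak{J}B'} \circ ((\mathfrak{J}B \Rightarrow \mathfrak{J}B') \times c))\;} & (C \Rightarrow \mathfrak{J}B')
\end{array}
\qquad (4)
\]

The bottom edge is the composite $\lambda(\mathrm{eval}_{\mathfrak{J}B,\mathfrak{J}B'} \circ ((\mathfrak{J}B \Rightarrow \mathfrak{J}B') \times c)) \circ m_{B,B'}$, and $\omega_{c,c'}$ is the invertible 2-cell filling the square.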
Example 7. The pullback (4) generalises the well-known definition of a logical relation of varying arity [36]. Indeed, where $\mathfrak{J} := \langle \mathfrak{R} \rangle$ is the relative hom-pseudofunctor for an fp-pseudofunctor $(\mathfrak{R}, q^{\times}) : \mathcal{B} \to \mathcal{X}$ between cc-bicategories, $A \in \mathcal{B}$ and $X, X' \in \mathcal{X}$, the functor $m_{X,X'}(A)$ takes a 1-cell $f : \mathfrak{R}A \to (X \Rightarrow X')$ in $\mathcal{X}$ to the pseudonatural transformation $\mathrm{Y}A \times \mathcal{X}(\mathfrak{R}(-), X) \Rightarrow \mathcal{X}(\mathfrak{R}(-), X')$ with components $\lambda B . \, \lambda(\rho : B \to A, \; u : \mathfrak{R}B \to X) . \, \mathrm{eval}_{X,X'} \circ \langle f \circ \mathfrak{R}(\rho), u \rangle$. Intuitively, therefore, the pullback enforces the usual closure condition defining a logical relation at exponential type, while also tracking the isomorphism witnessing that this condition holds (c.f. [36,3,15]).
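
The ‘usual closure condition’ referred to here can be made concrete in its simplest set-level form: two functions are related at exponential type exactly when they send related arguments to related results. The sketch below enumerates this relation for finite sets; it illustrates only the classical binary condition, not the intensional Kripke relations of the paper, and the names `functions` and `arrow_relation` and the sample relations are hypothetical.

```python
from itertools import product


def functions(dom, cod):
    """All functions dom -> cod between finite sets, represented as dicts."""
    dom = list(dom)
    return [dict(zip(dom, values)) for values in product(cod, repeat=len(dom))]


def arrow_relation(rel_a, rel_b, dom, cod):
    """Binary logical relation at type A => B:
    f ~ g  iff  (a, a2) in rel_a implies (f(a), g(a2)) in rel_b."""
    fs = functions(dom, cod)
    return [
        (f, g)
        for f in fs
        for g in fs
        if all((f[a], g[a2]) in rel_b for (a, a2) in rel_a)
    ]


# With equality at base types, the relation at function type is again equality.
bools = [0, 1]
equality = {(0, 0), (1, 1)}
related = arrow_relation(equality, equality, bools, bools)
```

In the bicategorical setting the membership test is replaced by a chosen isomorphism, which is the extra data the pullback (4) keeps track of.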


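Notation 6. For reasons of space, particularly in pasting diagrams, we will sometimes write $\tilde{c} := \mathrm{eval}_{\mathfrak{J}B,\mathfrak{J}B'} \circ ((\mathfrak{J}B \Rightarrow \mathfrak{J}B') \times c) : (\mathfrak{J}B \Rightarrow \mathfrak{J}B') \times C \to \mathfrak{J}B'$ when $c : C \to \mathfrak{J}B$ in $\mathcal{C}$.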


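The evaluation map $\mathrm{eval}_{\underline{C},\underline{C}'}$ is defined to be $(\mathrm{eval}_{C,C'} \circ (q_{c,c'} \times C), \; E_{c,c'}, \; \mathrm{eval}_{B,B'})$, where the witnessing 2-cell $E_{c,c'}$ is given by the pasting diagram below, in which the unlabelled arrow is $q^{\times}_{(B \Rightarrow B', B)} \circ (p_{c,c'} \times c)$.

[Pasting diagram defining $E_{c,c'}$. Its top edge is $\mathrm{eval}_{C,C'} \circ (q_{c,c'} \times C) : (C \supset C') \times C \to C'$ and its bottom edge is $\mathfrak{J}\mathrm{eval}_{B,B'} : \mathfrak{J}((B \Rightarrow B') \times B) \to \mathfrak{J}B'$; the interior passes through $\mathfrak{J}(B \Rightarrow B') \times C$, $(\mathfrak{J}B \Rightarrow \mathfrak{J}B') \times C$, $(C \Rightarrow \mathfrak{J}B') \times C$, $\mathfrak{J}(B \Rightarrow B') \times \mathfrak{J}B$ and $(\mathfrak{J}B \Rightarrow \mathfrak{J}B') \times \mathfrak{J}B$, and contains two cells marked $\cong$ together with instances of the counit $\varepsilon$.]

Here the bottom $\cong$ denotes a composite of $\Phi$, structural isomorphisms and $\Phi^{-1}$, and the top $\cong$ denotes a composite of $\omega_{c,c'} \times C$ with instances of $\Phi$, $\Phi^{-1}$, and the structural isomorphisms.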


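The currying operation. Let $\underline{R} := (R, r, Q)$, $\underline{C} := (C, c, B)$ and $\underline{C}' := (C', c', B')$ and suppose given a 1-cell $(t, \alpha, s) : \underline{R} \times \underline{C} \to \underline{C}'$. We construct $\lambda(t, \alpha, s)$ using the universal property (4) of the pullback. To this end, we define invertible composites $U_\alpha$ and $T_\alpha$ as in the following two diagrams and set $L_\alpha := \eta^{-1} \bullet e^{\dagger}(U_\alpha^{-1} \bullet \alpha \bullet T_\alpha) : \lambda(c' \circ \mathrm{eval}_{C,C'}) \circ \lambda t \Rightarrow (\lambda(\tilde{c}) \circ m_{B,B'}) \circ (\mathfrak{J}(\lambda s) \circ r)$.

The first diagram defines $U_\alpha$ as the composite

\[
\begin{aligned}
& \mathrm{eval}_{C,\mathfrak{J}B'} \circ \big(((\lambda(\tilde{c}) \circ m_{B,B'}) \circ (\mathfrak{J}(\lambda s) \circ r)) \times C\big) \\
&\cong (\mathrm{eval}_{C,\mathfrak{J}B'} \circ (\lambda(\tilde{c}) \times C)) \circ \big((m_{B,B'} \circ (\mathfrak{J}(\lambda s) \circ r)) \times C\big) \\
&\xRightarrow{\;\varepsilon_{\tilde{c}} \circ ((m_{B,B'} \circ (\mathfrak{J}(\lambda s) \circ r)) \times C)\;} \tilde{c} \circ \big((m_{B,B'} \circ (\mathfrak{J}(\lambda s) \circ r)) \times C\big) \\
&\cong (\mathrm{eval}_{\mathfrak{J}B,\mathfrak{J}B'} \circ (m_{B,B'} \times \mathfrak{J}B)) \circ ((\mathfrak{J}(\lambda s) \times \mathfrak{J}B) \circ (r \times c)) \\
&\Rightarrow (\mathfrak{J}(\mathrm{eval}_{B,B'}) \circ q^{\times}_{(B \Rightarrow B', B)}) \circ ((\mathfrak{J}(\lambda s) \times \mathfrak{J}\mathrm{Id}_B) \circ (r \times c)) \\
&\Rightarrow \mathfrak{J}(\mathrm{eval}_{B,B'} \circ (\lambda s \times B)) \circ (q^{\times}_{Q,B} \circ (r \times c)) \\
&\xRightarrow{\;\mathfrak{J}\varepsilon_s \circ (q^{\times}_{Q,B} \circ (r \times c))\;} \mathfrak{J}s \circ (q^{\times}_{Q,B} \circ (r \times c))
\end{aligned}
\]

in which the fourth step is induced by the counit $\varepsilon_{(\mathfrak{J}\mathrm{eval}_{B,B'} \circ q^{\times})}$ and the fifth step is the unlabelled arrow. The unlabelled arrow is the canonical composite of $\mathrm{nat}_{\lambda s, \mathrm{id}_B}$ with $\phi^{\mathfrak{J}}_{\mathrm{eval}, \lambda(s) \times B}$ and structural isomorphisms. $T_\alpha$ is then defined using $U_\alpha$:

\[
\begin{aligned}
& \mathrm{eval}_{C,\mathfrak{J}B'} \circ \big((\lambda(c' \circ \mathrm{eval}_{C,C'}) \circ \lambda t) \times C\big) \\
&\cong (\mathrm{eval}_{C,\mathfrak{J}B'} \circ (\lambda(c' \circ \mathrm{eval}_{C,C'}) \times C)) \circ (\lambda(t) \times C) \\
&\Rightarrow (c' \circ \mathrm{eval}_{C,C'}) \circ (\lambda(t) \times C)
\cong c' \circ (\mathrm{eval}_{C,C'} \circ (\lambda(t) \times C))
\Rightarrow c' \circ t
\end{aligned}
\]

where the two arrows marked $\Rightarrow$ are induced by the counits $\varepsilon_{(c' \circ \mathrm{eval}_{C,C'})}$ and $\varepsilon_t$ respectively.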


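Applying the universal property of the pullback (4) to $L_\alpha$, one obtains a 1-cell $\mathrm{lam}(t)$ and a pair of invertible 2-cells $\Gamma_{c,c'}$ and $\Delta_{c,c'}$ filling the diagram

[Diagram: the pullback square (4) together with the 1-cell $\mathrm{lam}(t) : R \to (C \supset C')$ into its apex, the 1-cells $\mathfrak{J}(\lambda s) \circ r : R \to \mathfrak{J}(B \Rightarrow B')$ and $\lambda t : R \to (C \Rightarrow C')$, and the 2-cells $\Gamma_{c,c'} : p_{c,c'} \circ \mathrm{lam}(t) \Rightarrow \mathfrak{J}(\lambda s) \circ r$ and $\Delta_{c,c'} : q_{c,c'} \circ \mathrm{lam}(t) \Rightarrow \lambda t$.]

We define $\lambda(t, \alpha, s) := (\mathrm{lam}(t), \Gamma_{c,c'}, \lambda s)$.

The counit 2-cell. Finally we come to the counit. For a 1-cell $\underline{t} := (t, \alpha, s) : (R, r, Q) \times (C, c, B) \to (C', c', B')$ the 1-cell $\mathrm{eval} \circ (\lambda(t, \alpha, s) \times (C, c, B))$ unwinds to the pasting diagram below, in which the unlabelled arrow is $q^{\times}_{Q,B} \circ (r \times c)$: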
[Pasting diagram: top edge $(\mathrm{eval}_{C,C'} \circ (q_{c,c'} \times C)) \circ (\mathrm{lam}(t) \times C) : R \times C \to C'$; bottom edge $\mathfrak{J}(\mathrm{eval}_{B,B'} \circ (\lambda s \times B)) : \mathfrak{J}(Q \times B) \to \mathfrak{J}B'$; the interior passes through $\mathfrak{J}Q \times \mathfrak{J}B$, $\mathfrak{J}(B \Rightarrow B') \times \mathfrak{J}B$ and $\mathfrak{J}((B \Rightarrow B') \times B)$, and contains the 2-cells $E_{c,c'}$ and nat.]

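For the counit $\underline{\varepsilon}_{\underline{t}}$ we take the 2-cell with first component $e_t$ defined by

\[
(\mathrm{eval}_{C,C'} \circ (q_{c,c'} \times C)) \circ (\mathrm{lam}(t) \times C)
\cong \mathrm{eval}_{C,C'} \circ ((q_{c,c'} \circ \mathrm{lam}(t)) \times C)
\xRightarrow{\;\mathrm{eval}_{C,C'} \circ (\Delta_{c,c'} \times C)\;}
\mathrm{eval}_{C,C'} \circ (\lambda(t) \times C)
\xRightarrow{\;\varepsilon_t\;} t
\]

and second component simply $\varepsilon_s : \mathrm{eval}_{B,B'} \circ (\lambda(s) \times B) \Rightarrow s$. This pair forms an invertible 2-cell in $\mathrm{gl}(\mathfrak{J})$. One checks this satisfies the required universal property in a manner analogous to the 1-categorical case (see [55] for the full details). This completes the proof of Theorem 5.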


5 Relative full completeness 


We apply the theory developed in the preceding two sections to prove the relative 
full completeness result. As outlined in the introduction, this corresponds to a 
proof of conservativity of the theory of rewriting for the higher-order equational 
theory of rewriting in STLC over the algebraic equational theory of rewriting in 
STPC. We adapt ‘Lafont’s argument’ [39, Annexe C] from the form presented in [16], for which we require bicategorical versions of the free cartesian category $\mathbb{F}^{\times}[\mathbb{C}]$ and free cartesian closed category $\mathbb{F}^{\times,\to}[\mathbb{C}]$ over a category $\mathbb{C}$. In line with the strategy for the STLC (c.f. [12, pp. 173-4]), we deal with the contravariance of the pseudofunctor $(- \Rightarrow =)$ by restricting to a bicategory of cc-pseudofunctors, pseudonatural equivalences (that is, pseudonatural transformations for which each component is a given equivalence), and invertible modifications. We denote this with the subscript $\simeq, \cong$.
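Lemma 7. For any bicategory $\mathcal{B}$, fp-bicategory $(\mathcal{C}, \Pi_n(-))$ and cc-bicategory $(\mathcal{D}, \Pi_n(-), \Rightarrow)$:

1. There exists an fp-bicategory $\mathcal{F}^{\times}[\mathcal{B}]$ and a pseudofunctor $\eta^{\times} : \mathcal{B} \to \mathcal{F}^{\times}[\mathcal{B}]$ such that composition with $\eta^{\times}$ induces a biequivalence $\text{fp-Bicat}(\mathcal{F}^{\times}[\mathcal{B}], \mathcal{C}) \xrightarrow{\;\simeq\;} \mathrm{Bicat}(\mathcal{B}, \mathcal{C})$.
2. There exists a cc-bicategory $\mathcal{F}^{\times,\to}[\mathcal{B}]$ and a pseudofunctor $\eta^{\Rightarrow} : \mathcal{B} \to \mathcal{F}^{\times,\to}[\mathcal{B}]$ such that composition with $\eta^{\Rightarrow}$ induces a biequivalence $\text{cc-Bicat}_{\simeq,\cong}(\mathcal{F}^{\times,\to}[\mathcal{B}], \mathcal{D}) \xrightarrow{\;\simeq\;} \mathrm{Bicat}(\mathcal{B}, \mathcal{D})$.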


Proof (sketch). A syntactic construction suffices: one defines formal products 
and exponentials and then quotients by the axioms (see [48, p. 79] or [55]). 


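Thus, for any bicategory $\mathcal{B}$, fp-bicategory $(\mathcal{C}, \Pi_n(-))$, and pseudofunctor $F : \mathcal{B} \to \mathcal{C}$ there exists an fp-pseudofunctor $F^{\#} : \mathcal{F}^{\times}[\mathcal{B}] \to \mathcal{C}$ and an equivalence $F^{\#} \circ \eta^{\times} \simeq F$. Moreover, for any fp-pseudofunctor $G : \mathcal{F}^{\times}[\mathcal{B}] \to \mathcal{C}$ such that $G \circ \eta^{\times} \simeq F$ one has $G \simeq F^{\#}$. A corresponding result holds for cc-bicategories and cc-pseudofunctors.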


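Theorem 7. For any bicategory $\mathcal{B}$ the universal fp-pseudofunctor $\iota : \mathcal{F}^{\times}[\mathcal{B}] \to \mathcal{F}^{\times,\to}[\mathcal{B}]$ extending $\eta^{\Rightarrow}$ is locally an equivalence. Hence $\eta^{\Rightarrow} : \mathcal{B} \to \mathcal{F}^{\times,\to}[\mathcal{B}]$ is locally an equivalence.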


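Proof. Since $\iota$ preserves finite products, the bicategory $\mathrm{gl}(\langle \iota \rangle)$ is cartesian closed (Theorem 5). Writing $\underline{\mathrm{Y}}$ for the extended Yoneda pseudofunctor, the composite $K := \underline{\mathrm{Y}} \circ \eta^{\times} : \mathcal{B} \to \mathrm{gl}(\langle \iota \rangle)$ therefore induces a cc-pseudofunctor $K^{\#} : \mathcal{F}^{\times,\to}[\mathcal{B}] \to \mathrm{gl}(\langle \iota \rangle)$.

First observe that $(K^{\#} \circ \iota) \circ \eta^{\times} \simeq K^{\#} \circ \eta^{\Rightarrow} \simeq K = \underline{\mathrm{Y}} \circ \eta^{\times}$. Since $\underline{\mathrm{Y}}$ is canonically an fp-pseudofunctor (Remark 4), it follows that $K^{\#} \circ \iota \simeq \underline{\mathrm{Y}}$. Since $\underline{\mathrm{Y}}$ is locally an equivalence (Lemma 5), Lemma 1(1) entails that $K^{\#} \circ \iota$ is locally an equivalence.

Next, examining the definition of $\underline{\mathrm{Y}}$ one sees that $\pi_{dom} \circ \underline{\mathrm{Y}} = \iota$, and so

\[
(\pi_{dom} \circ K^{\#}) \circ \eta^{\Rightarrow} \simeq (\pi_{dom} \circ \underline{\mathrm{Y}}) \circ \eta^{\times} = \iota \circ \eta^{\times} \simeq \eta^{\Rightarrow}
\]

It follows that $\pi_{dom} \circ K^{\#} \simeq \mathrm{id}_{\mathcal{F}^{\times,\to}[\mathcal{B}]}$, and hence that $\pi_{dom} \circ K^{\#}$ is also locally an equivalence.

Now consider the composite $\mathcal{F}^{\times}[\mathcal{B}] \xrightarrow{\;\iota\;} \mathcal{F}^{\times,\to}[\mathcal{B}] \xrightarrow{\;K^{\#}\;} \mathrm{gl}(\langle \iota \rangle) \xrightarrow{\;\pi_{dom}\;} \mathcal{F}^{\times,\to}[\mathcal{B}]$. By Lemma 1(2) and the preceding, $\iota$ is locally an equivalence. Finally, it is direct from the construction of $\mathcal{F}^{\times}[\mathcal{B}]$ that $\eta^{\times}$ is locally an equivalence; thus, so are $\iota \circ \eta^{\times} \simeq \eta^{\Rightarrow}$.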
Abstract. A systematic theory of structural limits for finite models has
been developed by Nešetřil and Ossona de Mendez. It is based on the
insight that the collection of finite structures can be embedded, via a 
map they call the Stone pairing, in a space of measures, where the desired 
limits can be computed. We show that a closely related but finer grained 
space of measures arises — via Stone-Priestley duality and the notion of 
types from model theory — by enriching the expressive power of first- 
order logic with certain “probabilistic operators”. We provide a sound 
and complete calculus for this extended logic and expose the functorial 
nature of this construction. 

The consequences are two-fold. On the one hand, we identify the logical 
gist of the theory of structural limits. On the other hand, our construction 
shows that the duality-theoretic variant of the Stone pairing captures the 
adding of a layer of quantifiers, thus making a strong link to recent work 
on semiring quantifiers in logic on words. In the process, we identify the 
model theoretic notion of types as the unifying concept behind this link. 
These results contribute to bridging the strands of logic in computer sci- 
ence which focus on semantics and on more algorithmic and complexity 
related areas, respectively. 


Keywords: Stone duality · finitely additive measures · structural limits · finite model theory · formal languages · logic on words


1 Introduction 


While topology plays an important role, via Stone duality, in many parts of se- 
mantics, topological methods in more algorithmic and complexity oriented areas 
of theoretical computer science are not so common. One of the few examples, 


* This project has been supported by the European Research Council (ERC) under 
the European Union’s Horizon 2020 research and innovation program (grant agree- 
ment No.670624). Luca Reggio has received an individual support under the grants 
GA17-046305 of the Czech Science Foundation, and No.184693 of the Swiss National 
Science Foundation. 


© The Author(s) 2020 
J. Goubault-Larrecq and B. König (Eds.): FOSSACS 2020, LNCS 12077, pp. 299-318, 2020. 
https: //doi.org/10.1007/978-3-030-45231-5_16 


300 M. Gehrke et al. 


the one we want to consider here, is the study of limits of finite relational structures. We will focus on the structural limits introduced by Nešetřil and Ossona de Mendez [15,17]. These provide a common generalisation of various notions of limits of finite structures studied in probability theory, random graphs, structural graph theory, and finite model theory. The basic construction in this work is the so-called Stone pairing. Given a relational signature $\sigma$ and a first-order formula $\varphi$ in the signature $\sigma$ with free variables $v_1, \dots, v_n$, define

\[
\langle \varphi, A \rangle = \frac{|\{\bar{a} \in A^n \mid A \models \varphi(\bar{a})\}|}{|A|^n}
\quad \text{(the probability that a random assignment in } A \text{ satisfies } \varphi\text{).}
\qquad (1)
\]


Nešetřil and Ossona de Mendez view the map $A \mapsto \langle -, A \rangle$ as an embedding
of the finite $\sigma$-structures into the space of probability measures over the Stone
space dual to the Lindenbaum-Tarski algebra of all first-order formulas in the
signature $\sigma$. This space is complete and thus provides the desired limit objects
for all sequences of finite structures which embed as Cauchy sequences.
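
As a small illustration of equation (1), the following Python sketch computes the Stone pairing by brute force. It is only a sketch under stated assumptions: the function name `stone_pairing`, the encoding of a formula as a Python predicate, and the example structure (a directed 3-cycle) are all hypothetical choices made for illustration and do not come from the paper.

```python
from fractions import Fraction
from itertools import product

def stone_pairing(universe, formula, arity):
    """Fraction of assignments (a_1, ..., a_n) in universe^n satisfying `formula`.

    `formula` is a Python predicate standing in for a first-order formula
    with `arity` free variables (an illustrative encoding, not the paper's).
    """
    universe = list(universe)
    satisfying = sum(1 for a in product(universe, repeat=arity) if formula(*a))
    return Fraction(satisfying, len(universe) ** arity)

# Hypothetical example structure: the directed 3-cycle on {0, 1, 2}.
edges = {(0, 1), (1, 2), (2, 0)}
universe = range(3)

# phi(v1, v2) := E(v1, v2)
edge_density = stone_pairing(universe, lambda x, y: (x, y) in edges, 2)

# psi(v1) := exists y. E(v1, y); the quantifier is evaluated by brute force.
has_successor = stone_pairing(
    universe, lambda x: any((x, y) in edges for y in universe), 1
)

print(edge_density, has_successor)  # prints: 1/3 1
```

Exact rational arithmetic is used because the values of the pairing on a structure $A$ with $n$ free variables are always multiples of $1/|A|^n$.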


Another example of topological methods in an algorithmically oriented area
of computer science is the use of profinite monoids in automata theory. In this
setting, profinite monoids are the subject of an extensive theory, based on
theorems by Eilenberg and Reiterman, and used, among other things, to settle
decidability questions [18]. In [4], it was shown that this theory may be understood
as an application of Stone duality, thus making a bridge between semantics and more
algorithmically oriented work. Bridging this semantics-versus-algorithmics gap
in theoretical computer science has since gained quite some momentum, notably
with the recent strand of research by Abramsky, Dawar and co-workers [2,3]. In
this spirit, a natural question is whether the structural limits of Nešetřil and
Ossona de Mendez can also be understood semantically, and in particular whether
the topological component may be seen as an application of Stone duality.


More precisely, recent work on understanding quantifiers in the setting of
logic on finite words has shown that adding a layer of certain quantifiers
(such as classical and modular quantifiers) corresponds dually to measure space
constructions. The measures involved are not classical but only finitely additive
and they take values in finite semirings rather than in the unit interval.
Nevertheless, this appearance of measures as duals of quantifiers raises the further
question whether the measure spaces in the theory of structural limits may be
obtained via Stone duality from a semantic addition of certain quantifiers to
classical first-order logic.


The purpose of this paper is to address this question. Our main result is that
the Stone pairing of Nešetřil and Ossona de Mendez is related by a retraction
to a Stone space of measures, which is dual to the Lindenbaum-Tarski algebra
of a logic fragment obtained from first-order logic by adding one layer of
probabilistic quantifiers, and which arises in exactly the same way as the spaces of
semiring-valued measures in logic on words. That is, the Stone pairing, although
originating from other considerations, may be seen as arising by duality from a
semantic construction.

A foreseeable hurdle is that spaces of classical measures are valued in the unit
interval $[0,1]$, which is not zero-dimensional and hence outside the scope of Stone
duality. This is well known to cause problems e.g. in attempts to combine
non-determinism and probability in domain theory [12]. However, in the structural
limits of Nešetřil and Ossona de Mendez, at the base, one only needs to talk
about finite models equipped with normal distributions and thus only the finite
intervals $I_n = \{0, \frac{1}{n}, \frac{2}{n}, \ldots, 1\}$ are involved. A careful
duality-theoretic analysis identifies a codirected diagram (i.e. an inverse limit
system) based on these intervals compatible with the Stone pairing. The resulting
inverse limit, which we denote $\Gamma$, is a Priestley space. It comes equipped with
an algebra-like structure, which allows us to reformulate many aspects of the theory
of structural limits in terms of $\Gamma$-valued measures as opposed to $[0,1]$-valued
measures.

The analysis justifying the structure of $\Gamma$ is based on duality theory for double
quasi-operator algebras [7,8]. In the presentation, we have tried to compromise
between giving interesting topo-relational insights into why $\Gamma$ is as it is, and not
overburdening the reader with technical details. Some interesting features of $\Gamma$,
dictated by the nature of the Stone pairing and the ensuing codirected diagram,
are that


• $\Gamma$ is based on a version of $[0,1]$ in which the rationals are doubled;
• $\Gamma$ comes with section-retraction maps $[0,1] \xrightarrow{\;\iota\;} \Gamma \xrightarrow{\;\gamma\;} [0,1]$;
• the map $\iota$ is lower semicontinuous while the map $\gamma$ is continuous.


These features are a consequence of general theory and precisely allow us to
witness continuous phenomena relative to $[0,1]$ in the setting of $\Gamma$.


Our contribution 


We show that the ambient measure space for the structural limits of Nešetřil
and Ossona de Mendez can be obtained via “adding a layer of quantifiers” in
a suitable enrichment of first-order logic. The conceptual framework for seeing
this is that of types from classical model theory. More precisely, we will see that
a variant of the Stone pairing is a map into a space of measures with values in a
Priestley space $\Gamma$. Further, we show that this map is in fact the embedding of the
finite structures into the space of (0-)types of an extension of first-order logic,
which we axiomatise. On the other hand, $\Gamma$-valued measures and $[0,1]$-valued
measures are tightly related by a retraction-section pair which allows the transfer
of properties. These results identify the logical gist of the theory of structural
limits and provide a new interesting connection between logic on words and the
theory of structural limits in finite model theory.


Outline of the paper. In Section 2 we briefly recall Stone-Priestley duality, its
application in logic via spaces of types, and the particular instance of logic on
words (needed only to show the similarity of the constructions). In Section 3 we
introduce the Priestley space $\Gamma$ with its additional operations, and show that
it admits $[0,1]$ as a retract. The spaces of $\Gamma$-valued measures are introduced in
Section 4 and the retraction of $\Gamma$ onto $[0,1]$ is lifted to the appropriate spaces
of measures. In Section 5 we introduce the $\Gamma$-valued Stone pairing and make
the link with logic on words. Further, we compare convergence in the space of
$\Gamma$-valued measures with the one considered by Nešetřil and Ossona de Mendez.
Finally, in Section 6 we show that constructing the space of $\Gamma$-valued measures
dually corresponds to enriching the logic with probabilistic operators.


2 Preliminaries 


Notation. Throughout this paper, if $X \xrightarrow{\;f\;} Y \xrightarrow{\;g\;} Z$ are functions,
their composition is denoted $g \cdot f$. For a subset $S \subseteq X$,
$f{\restriction}_S \colon S \to Y$ is the obvious restriction.
Given any set $T$, $\wp(T)$ denotes its power-set. Further, for a poset $P$, $P^{\mathrm{op}}$ is the
poset obtained by turning the order of $P$ upside down.


2.1 Stone-Priestley duality 


In this paper, we will need Stone duality for bounded distributive lattices in the 
order topological form due to Priestley [19]. It is a powerful and well established 
tool in the study of propositional logic and semantics of programming languages, 
see e.g. [DI] for major landmarks. We briefly recall how this duality works. 

A compact ordered space is a pair $(X, \leq)$ where $X$ is a compact space and $\leq$ is
a partial order on $X$ which is closed in the product topology of $X \times X$. (Note that
such a space is automatically Hausdorff.) A compact ordered space is a Priestley
space provided it is totally order-disconnected. That is, for all $x, y \in X$ such that
$x \not\leq y$, there is a clopen (i.e. simultaneously closed and open) $C \subseteq X$ which is
an up-set for $\leq$, and satisfies $x \in C$ but $y \notin C$. We recall the construction of the
Priestley space of a distributive lattice $D$.³

A non-empty proper subset $F \subseteq D$ is a prime filter if it (i) is upward closed
(in the natural order of $D$), (ii) is closed under finite meets, and (iii) satisfies:
if $a \vee b \in F$, then either $a \in F$ or $b \in F$. Denote by $X_D$ the set of all prime
filters of $D$. By Stone’s Prime Filter Theorem, the map


\[
\llbracket - \rrbracket \colon D \to \wp(X_D), \qquad a \mapsto \llbracket a \rrbracket = \{F \in X_D \mid a \in F\}
\]


is an embedding. Priestley’s insight was that $D$ can be recovered from $X_D$, if
the latter is equipped with the inclusion order and the topology generated by
the sets of the form $\llbracket a \rrbracket$ and their complements. This makes $X_D$ into a Priestley
space (the dual space of $D$), and the map $\llbracket - \rrbracket$ is an isomorphism between
$D$ and the lattice of clopen up-sets of $X_D$. Conversely, any Priestley space $X$
is the dual space of the lattice of its clopen up-sets. We call the latter the dual
lattice of $X$. This correspondence extends to morphisms. In fact, Priestley duality
states that the category of distributive lattices with homomorphisms is dually
equivalent to the category of Priestley spaces and continuous monotone maps.


3 We assume all distributive lattices are bounded, with the bottom and top denoted 
by 0 and 1, respectively. The bounds need to be preserved by homomorphisms. 

When restricting to Boolean algebras, we recover the celebrated Stone duality
between Boolean algebras and Boolean spaces, i.e. compact Hausdorff spaces
in which the clopen subsets form a basis.


2.2 Stone duality and logic: type spaces 


The theory of types is an important tool for first-order logic. We briefly recall the 
concept as it is closely related to, and provides the link between, two otherwise 
unrelated occurrences of topological methods in theoretical computer science. 

Consider a signature $\sigma$ and a first-order theory $T$ in this signature. For each
$n \in \mathbb{N}$, let $\mathrm{Fm}_n$ denote the set of first-order formulas whose free variables are
among $\bar{v} = \{v_1, \ldots, v_n\}$, and let $\mathrm{Mod}_n(T)$ denote the class of all pairs $(A, \alpha)$
where $A$ is a model of $T$ and $\alpha$ is an interpretation of $\bar{v}$ in $A$. Then the
satisfaction relation, $(A, \alpha) \models \varphi$, is a binary relation from $\mathrm{Mod}_n(T)$ to $\mathrm{Fm}_n$. It induces
the equivalence relations of elementary equivalence $\equiv$ and logical equivalence $\approx$
on these sets, respectively. The quotient $\mathrm{FO}_n(T) = \mathrm{Fm}_n/{\approx}$ carries a natural
Boolean algebra structure and is known as the $n$-th Lindenbaum-Tarski algebra
of $T$. Its dual space is $\mathrm{Typ}_n(T)$, the space of $n$-types of $T$, whose points can
be identified with elements of $\mathrm{Mod}_n(T)/{\equiv}$. The Boolean algebra $\mathrm{FO}(T)$ of all
first-order formulas modulo logical equivalence over $T$ is the directed colimit of
the $\mathrm{FO}_n(T)$ for $n \in \mathbb{N}$ while its dual space, $\mathrm{Typ}(T)$, is the codirected limit of
the $\mathrm{Typ}_n(T)$ for $n \in \mathbb{N}$ and consists of models equipped with interpretations of
the full set of variables.

If we want to study finite models, there are two equivalent approaches: e.g. at
the level of sentences, we can either consider the theory $T_{\mathrm{fin}}$ of finite $T$-models,
or the closure of the collection of all finite $T$-models in the space $\mathrm{Typ}_0(T)$. This
closure yields a space, which should tell us about finite $T$-structures. Indeed, it is
equal to $\mathrm{Typ}_0(T_{\mathrm{fin}})$, the space of pseudofinite $T$-structures. For an application of
this, see [10]. Below, we will see an application in finite model theory of the case
$T = \emptyset$ (in this case we write $\mathrm{FO}(\sigma)$ and $\mathrm{Typ}(\sigma)$ instead of $\mathrm{FO}(\emptyset)$ and $\mathrm{Typ}(\emptyset)$).


In light of the theory of types as exposed above, the Stone pairing of Nešetřil
and Ossona de Mendez (see equation (1)) can be regarded as an embedding of
finite structures into the space of probability measures on $\mathrm{Typ}(\sigma)$, which
set-theoretically are finitely additive functions $\mathrm{FO}(\sigma) \to [0,1]$.


2.3 Duality and logic on words 


As mentioned in the introduction, spaces of measures arise via duality in logic on
words [5]. Logic on words, as introduced by Büchi, see e.g. [14] for a recent survey,
is a variation and specialisation of finite model theory where only models based
on words are considered. That is, a word $w \in A^*$ is seen as a relational structure
on $\{1, \ldots, |w|\}$, where $|w|$ is the length of $w$, equipped with a unary relation
$P_a$, for each $a \in A$, singling out the positions in the word where the letter $a$
appears. Each sentence $\varphi$ in a language interpretable over these structures yields
a language $L_\varphi \subseteq A^*$ consisting of the words satisfying $\varphi$. Thus, logic fragments
are considered modulo the theory of finite words and the Lindenbaum-Tarski
algebras are subalgebras of $\wp(A^*)$ consisting of the appropriate $L_\varphi$’s, cf. [10] for
a treatment of first-order logic on words.

For lack of logical completeness, the duals of the Lindenbaum-Tarski
algebras have more points than those given by models. Nevertheless, the dual spaces
of types, which act as compactifications and completions of the collections of
models, provide a powerful tool for studying logic fragments by topological
means. The central notion is that of recognition, in which a Boolean subalgebra
$\mathcal{B} \subseteq \wp(A^*)$ is studied by means of the dual map $\eta \colon \beta(A^*) \to X_{\mathcal{B}}$. Here $\beta(A^*)$ is
the Stone dual of $\wp(A^*)$, also known in topology as the Čech-Stone
compactification of the discrete space $A^*$, and $X_{\mathcal{B}}$ is the Stone dual of $\mathcal{B}$. The set $A^*$ embeds
in $\beta(A^*)$, and $\eta$ is uniquely determined by its restriction $\eta_0 \colon A^* \to X_{\mathcal{B}}$. Now,
Stone duality implies that $L \subseteq A^*$ is in $\mathcal{B}$ iff there is a clopen subset $V \subseteq X_{\mathcal{B}}$ so
that $\eta_0^{-1}(V) = L$. Anytime the latter is true for a map $\eta$ and a language $L$ as
above, one says that $\eta$ recognises $L$.⁴

When studying logic fragments via recognition, the following inductive step
is central: given a notion of quantifier and a recogniser for a Boolean algebra
of formulas with a free variable, construct a recogniser for the Boolean algebra
generated by the formulas obtained by applying the quantifier. This problem was
solved in [5], using duality theory, in a general setting of semiring quantifiers. The
latter are defined as follows: let $(S, +, \cdot, 0_S, 1_S)$ be a semiring, and $k \in S$. Given a
formula $\psi(v)$, the formula $\exists_k v.\psi(v)$ is true of a word $w \in A^*$ iff $k = 1_S + \cdots + 1_S$,
$m$ times, where $m$ is the number of assignments of the variable $v$ in $w$ satisfying
$\psi(v)$. If $S = \mathbb{Z}/q\mathbb{Z}$, we obtain the so-called modular quantifiers, and for $S$ the
two-element lattice we recover the existential quantifier $\exists$.
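
The semantics of a semiring quantifier can be sketched in a few lines of Python. This is a hedged illustration, not code from [5]: the helper names (`count_in_semiring`, `semiring_quantifier`), the encoding of a semiring as a triple `one`, `add`, `zero`, the 0-based positions, and the convention that the empty sum is $0_S$ are all assumptions made for the example.

```python
def count_in_semiring(m, one, add, zero):
    """Return 1_S + ... + 1_S (m times); the empty sum is taken to be 0_S."""
    total = zero
    for _ in range(m):
        total = add(total, one)
    return total

def semiring_quantifier(word, psi, k, one, add, zero):
    """Truth value of  exists_k v. psi(v)  on `word`.

    `psi(word, pos)` is a Python predicate standing in for a formula with one
    free variable; positions are 0-based here, unlike {1, ..., |w|} in the text.
    """
    m = sum(1 for pos in range(len(word)) if psi(word, pos))
    return count_in_semiring(m, one, add, zero) == k

# S = Z/2Z: a modular quantifier (counting modulo 2).
mod2 = dict(one=1, add=lambda x, y: (x + y) % 2, zero=0)
# S = the two-element lattice, with join as addition: the existential quantifier.
two = dict(one=True, add=lambda x, y: x or y, zero=False)

def is_a(word, pos):
    """psi(v): the letter at position v is 'a'."""
    return word[pos] == "a"

even_number_of_as = semiring_quantifier("abab", is_a, 0, **mod2)   # two a's
some_a = semiring_quantifier("bab", is_a, True, **two)
```

With $S = \mathbb{Z}/2\mathbb{Z}$ and $k = 0$ the quantifier holds exactly on words with an even number of witnesses, while over the two-element lattice with $k = 1_S$ it holds exactly when at least one witness exists.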

To deal with formulas with a free variable, one considers maps of the form
$f \colon \beta((A \times 2)^*) \to X$ (the extra bit in $A \times 2$ is used to mark the interpretation
of the free variable). In [5] (see also [6]), it was shown that $L_{\psi(v)}$ is recognised
by $f$ iff for every $k \in S$ the language $L_{\exists_k v.\psi(v)}$ is recognised by the composite


\[
A^* \xrightarrow{\;R\;} \mathcal{S}(\beta((A \times 2)^*)) \xrightarrow{\;\mathcal{S}(f)\;} \mathcal{S}(X), \tag{2}
\]


where $\mathcal{S}(X)$ is the space of finitely additive $S$-valued measures on $X$, and $R$
maps $w \in A^*$ to the measure $\mu_w \colon \wp((A \times 2)^*) \to S$ sending $K \subseteq (A \times 2)^*$ to the
sum $1_S + \cdots + 1_S$, $n_{w,K}$ times. Here, $n_{w,K}$ is the number of interpretations $\alpha$ of
the free variable $v$ in $w$ such that the pair $(w, \alpha)$, seen as an element of $(A \times 2)^*$,
belongs to $K$. Finally, $\mathcal{S}(f)$ sends a measure to its pushforward along $f$.


3 The space $\Gamma$


Central to our results is a Priestley space $\Gamma$ closely related to $[0,1]$, in which our
measures will take values. Its construction comes from the insight that the range


4 Here, being beyond the scope of this paper, we are ignoring the important role of 
the monoid structure available on the spaces (in the form of profinite monoids or 


BiMs, cf. ). 
of the Stone pairing $\langle -, A \rangle$, for a finite structure $A$ and formulas restricted to a
fixed number of free variables, can be confined to a chain $I_n = \{0, \frac{1}{n}, \frac{2}{n}, \ldots, 1\}$.
Moreover, the floor functions $f_{mn,n} \colon I_{mn} \twoheadrightarrow I_n$ are monotone surjections. The
ensuing system $\{f_{mn,n} \colon I_{mn} \twoheadrightarrow I_n \mid m, n \in \mathbb{N}\}$ can thus be seen as a codirected
diagram of finite discrete posets and monotone maps. Let us define $\Gamma$ to be the
limit of this diagram. Then, $\Gamma$ is naturally equipped with a structure of Priestley
space, see e.g. Corollary VI.3.3], and can be represented as based on the set

\[
\{r^- \mid r \in (0,1]\} \cup \{q^\circ \mid q \in \mathbb{Q} \cap [0,1]\}.
\]
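
The codirected diagram of finite chains can be explored concretely. The sketch below assumes that the floor function $f_{mn,n}$ rounds a point of $I_{mn}$ down to the nearest multiple of $1/n$; this reading, and the helper names `chain` and `floor_to`, are assumptions made for illustration.

```python
from fractions import Fraction
from math import floor

def chain(n):
    """The finite chain I_n = {0, 1/n, 2/n, ..., 1}."""
    return [Fraction(k, n) for k in range(n + 1)]

def floor_to(n):
    """Round x in [0, 1] down to the nearest multiple of 1/n (a point of I_n).

    Restricted to I_{mn}, this plays the role of f_{mn,n} (assumed reading).
    """
    return lambda x: Fraction(floor(x * n), n)

f_12_6 = floor_to(6)   # I_12 -> I_6
f_6_3 = floor_to(3)    # I_6  -> I_3
f_12_3 = floor_to(3)   # I_12 -> I_3

# Compatibility of the diagram: going I_12 -> I_6 -> I_3 agrees with I_12 -> I_3.
compatible = all(f_6_3(f_12_6(x)) == f_12_3(x) for x in chain(12))
```

Compatibility of these maps under composition is what makes the family a diagram whose limit can be taken.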


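The order of $\Gamma$ is the unique total order which has $0^\circ$ as bottom element, satisfies
$r^* < s^*$ if and only if $r < s$ for $* \in \{-, \circ\}$, and such that $q^\circ$ is a cover of $q^-$
for every rational $q \in (0,1]$ (i.e. $q^- < q^\circ$, and there is no element strictly in
between). In a sense, the values $q^-$ represent approximations of the values of the
form $q^\circ$. Cf. Figure 1. The topology of $\Gamma$ is generated by the sets of the form


\[
{\uparrow} p^\circ = \{x \in \Gamma \mid p^\circ \leq x\} \qquad \text{and} \qquad {\downarrow} q^- = \{x \in \Gamma \mid x \leq q^-\}
\]


for $p, q \in \mathbb{Q} \cap [0,1]$ such that $q \neq 0$. The distributive lattice dual to $\Gamma$, denoted
by $L$, is given by


\[
L = \{\bot\} \cup (\mathbb{Q} \cap [0,1])^{\mathrm{op}}, \quad \text{with } \bot <_L q \text{ and } q \leq_L p \text{ for every } p \leq q \text{ in } \mathbb{Q} \cap [0,1].
\]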


Fig. 1. The Priestley space $\Gamma$ and its dual lattice $L$
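
To get a feel for the order on $\Gamma$, one can model its points with rational underlying value in Python. The encoding below is an assumption made for illustration: $r^-$ is the pair `(r, MINUS)` and $r^\circ$ is the pair `(r, CIRC)`, so that Python's lexicographic comparison of tuples reproduces the order described above. Points $r^-$ with $r$ irrational are not modelled.

```python
from fractions import Fraction

MINUS, CIRC = 0, 1   # r^- is encoded as (r, MINUS), r^o as (r, CIRC)

def point(r, kind):
    """Build a point of Gamma with rational underlying value r."""
    r = Fraction(r)
    if not (0 <= r <= 1) or (kind == MINUS and r == 0):
        raise ValueError("not a point of Gamma")   # note: 0^- does not exist
    return (r, kind)

def is_cover(lower, upper, sample):
    """True if lower < upper and no point of the finite `sample` lies strictly between."""
    return lower < upper and not any(lower < z < upper for z in sample)

def in_up_circ(p, x):
    """Membership of x in the basic open set  up(p^o) = {x | p^o <= x}."""
    return point(p, CIRC) <= x

def in_down_minus(q, x):
    """Membership of x in the basic open set  down(q^-) = {x | x <= q^-}."""
    return x <= point(q, MINUS)

# A finite sample of Gamma: 0^o and the doubled points k/4 for k = 1, ..., 4.
sample = [point(0, CIRC)] + [
    point(Fraction(k, 4), kind) for k in range(1, 5) for kind in (MINUS, CIRC)
]
half = Fraction(1, 2)
```

On this sample, every point lies in exactly one of ${\uparrow} q^\circ$ and ${\downarrow} q^-$ for $q = 1/2$, which reflects the fact that $q^\circ$ covers $q^-$ in a total order.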


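3.1 The algebraic structure on $\Gamma$


When defining measures we need an algebraic structure available on the space of
values. The space $\Gamma$ fulfils this requirement as it comes equipped with a partial
operation $- \colon \mathrm{dom}(-) \to \Gamma$, where $\mathrm{dom}(-) = \{(x, y) \in \Gamma \times \Gamma \mid y \leq x\}$ and


\[
\begin{aligned}
r^\circ - s^\circ &= (r - s)^\circ, \\
r^- - s^\circ &= (r - s)^-, \\
r^\circ - s^- = r^- - s^- &=
\begin{cases}
(r - s)^\circ & \text{if } r - s \in \mathbb{Q}, \\
(r - s)^- & \text{otherwise.}
\end{cases}
\end{aligned}
\]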


In fact, this (partial) operation is dual to the truncated addition on the lattice
$L$. However, explaining this would require us to delve into extended Priestley
duality for lattices with operations, which is beyond the scope of this paper. See
[9] and also [7,8] for details. It also follows from the general theory that there
exists another partial operation definable from $-$, namely:


\[
\sim \colon \mathrm{dom}(-) \to \Gamma, \qquad x \sim y = \bigvee \{x - q^\circ \mid y < q^\circ \leq x\}.
\]

Next, we collect some basic properties of $-$ and $\sim$, needed in Section 4, which
follow from the general theory of [7,8]. First, recall that a map into an ordered
topological space is lower (resp. upper) semicontinuous provided the preimage
of any open down-set (resp. open up-set) is open.


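Lemma 1. If $\mathrm{dom}(-)$ is seen as a subspace of $\Gamma \times \Gamma^{\mathrm{op}}$, the following hold:


1. $\mathrm{dom}(-)$ is a closed up-set in $\Gamma \times \Gamma^{\mathrm{op}}$;

2. both $- \colon \mathrm{dom}(-) \to \Gamma$ and $\sim \colon \mathrm{dom}(-) \to \Gamma$ are monotone in the first
coordinate, and antitone in the second;

3. $- \colon \mathrm{dom}(-) \to \Gamma$ is lower semicontinuous;

4. $\sim \colon \mathrm{dom}(-) \to \Gamma$ is upper semicontinuous.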


3.2 The retraction $\Gamma \twoheadrightarrow [0,1]$


In this section we show that, with respect to appropriate topologies, the unit
interval $[0,1]$ can be obtained as a topological retract of $\Gamma$, in a way which is
compatible with the operation —. This will be important in Sections [4] and 
where we need to move between $[0,1]$-valued and $\Gamma$-valued measures. Let us
define the monotone surjection given by collapsing the doubled elements:


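\[
\gamma \colon \Gamma \to [0,1], \qquad r^-, r^\circ \mapsto r. \tag{3}
\]
The map $\gamma$ has a right adjoint, given by
\[
\iota \colon [0,1] \to \Gamma, \qquad r \mapsto
\begin{cases}
r^\circ & \text{if } r \in \mathbb{Q}, \\
r^- & \text{otherwise.}
\end{cases} \tag{4}
\]
Indeed, it is readily seen that $\gamma(y) \leq x$ iff $y \leq \iota(x)$, for all $y \in \Gamma$ and $x \in [0,1]$.
The composition $\gamma \cdot \iota$ coincides with the identity on $[0,1]$, i.e. $\iota$ is a section of $\gamma$.
Moreover, this retraction lifts to a topological retract provided we equip $\Gamma$ and
$[0,1]$ with the topologies consisting of the open down-sets: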


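Lemma 2. The map $\gamma \colon \Gamma \to [0,1]$ is continuous and the map $\iota \colon [0,1] \to \Gamma$ is
lower semicontinuous.


Proof. To check continuity of $\gamma$ observe that, for a rational $q \in (0,1)$, $\gamma^{-1}(q, 1]$
and $\gamma^{-1}[0, q)$ coincide, respectively, with the open sets


\[
\bigcup \{{\uparrow} p^\circ \mid p \in \mathbb{Q} \cap [0,1] \text{ and } q < p\}
\qquad \text{and} \qquad
\bigcup \{{\downarrow} p^- \mid p \in \mathbb{Q} \cap (0,1] \text{ and } p < q\}.
\]


Also, $\iota$ is lower semicontinuous, for $\iota^{-1}({\downarrow} q^-) = [0, q)$ whenever $q \in \mathbb{Q} \cap (0,1]$.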


It is easy to see that both $\gamma$ and $\iota$ preserve the minus structure available on
$\Gamma$ and $[0,1]$ (the unit interval is equipped with the usual minus operation $x - y$
defined whenever $y \leq x$), that is,


• $\gamma(x - y) = \gamma(x \sim y) = \gamma(x) - \gamma(y)$ whenever $y \leq x$ in $\Gamma$, and
• $\iota(x - y) = \iota(x) - \iota(y)$ whenever $y \leq x$ in $[0,1]$.
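
The identities above can be checked mechanically on points of $\Gamma$ with rational underlying value. The following sketch makes several assumptions for the sake of illustration: $r^-$ and $r^\circ$ are encoded as the pairs `(r, MINUS)` and `(r, CIRC)`, only rational $r$ are modelled (so the "otherwise" cases of the definitions of $-$ and $\iota$ never occur), and the operation $\sim$ is left out.

```python
from fractions import Fraction

MINUS, CIRC = 0, 1   # (r, MINUS) encodes r^-, (r, CIRC) encodes r^o

def minus(x, y):
    """Partial operation x - y on rational points of Gamma, defined when y <= x."""
    if not y <= x:
        raise ValueError("x - y is only defined when y <= x")
    (r, i), (s, j) = x, y
    # Subtracting s^o keeps the decoration of x; subtracting s^- yields (r-s)^o,
    # because r - s is always rational for the points modelled here.
    return (r - s, i if j == CIRC else CIRC)

def gamma(x):
    """The retraction: forget the decoration."""
    return x[0]

def iota(r):
    """The section, restricted to rational r: r |-> r^o."""
    return (Fraction(r), CIRC)

rationals = [Fraction(k, 4) for k in range(5)]
points = [(Fraction(0), CIRC)] + [
    (Fraction(k, 4), d) for k in range(1, 5) for d in (MINUS, CIRC)
]

# gamma(x - y) = gamma(x) - gamma(y) whenever y <= x in Gamma
gamma_preserves_minus = all(
    gamma(minus(x, y)) == gamma(x) - gamma(y)
    for x in points for y in points if y <= x
)
# iota(a - b) = iota(a) - iota(b) whenever b <= a in [0, 1]
iota_preserves_minus = all(
    iota(a - b) == minus(iota(a), iota(b))
    for a in rationals for b in rationals if b <= a
)
# adjunction: gamma(y) <= x  iff  y <= iota(x)
adjunction = all(
    (gamma(y) <= x) == (y <= iota(x)) for y in points for x in rationals
)
```

A finite check of this kind is no substitute for the proof, but it helps to confirm that the case distinctions in the definition of $-$ have been read correctly.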


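Remark. $\iota \colon [0,1] \to \Gamma$ is not upper semicontinuous because, for every
$q \in \mathbb{Q} \cap [0,1]$,
$\iota^{-1}({\uparrow} q^\circ) = \{x \in [0,1] \mid q^\circ \leq \iota(x)\} = \{x \in [0,1] \mid \gamma(q^\circ) \leq x\} = [q, 1]$.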


4 Spaces of measures valued in $\Gamma$ and in $[0,1]$


The aim of this section is to replace $[0,1]$-valued measures by $\Gamma$-valued measures.
The reason for doing this is two-fold. First, the space of $\Gamma$-valued measures is
Priestley (Proposition 4), and thus amenable to a duality theoretic treatment
and a dual logic interpretation (cf. Section 6). Second, it retains more topological
information than the space of $[0,1]$-valued measures. Indeed, the former retracts
onto the latter (Theorem 10).

Let $D$ be a distributive lattice. Recall that, classically, a monotone function
$m \colon D \to [0,1]$ is a (finitely additive, probability) measure provided $m(0) = 0$,
$m(1) = 1$, and $m(a) + m(b) = m(a \vee b) + m(a \wedge b)$ for every $a, b \in D$. The latter
property is equivalently expressed as


\[
\forall a, b \in D, \quad m(a) - m(a \wedge b) = m(a \vee b) - m(b). \tag{5}
\]
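
For a finite lattice of sets, the classical conditions can be verified exhaustively. The sketch below is an illustration under assumptions: the lattice is a power set ordered by inclusion, and the names `powerset`, `is_measure`, `uniform` and `squared` are hypothetical.

```python
from fractions import Fraction
from itertools import combinations

def powerset(xs):
    """All subsets of xs, as frozensets."""
    xs = list(xs)
    return [frozenset(c) for k in range(len(xs) + 1) for c in combinations(xs, k)]

def is_measure(m, lattice, bottom, top):
    """Check m(0) = 0, m(1) = 1, monotonicity and condition (5) on a finite lattice of sets."""
    if m(bottom) != 0 or m(top) != 1:
        return False
    for a in lattice:
        for b in lattice:
            if a <= b and m(a) > m(b):                    # monotonicity
                return False
            if m(a) - m(a & b) != m(a | b) - m(b):        # condition (5)
                return False
    return True

X = frozenset({0, 1, 2})
D = powerset(X)

def uniform(a):
    """Normalised counting measure on the power set of X."""
    return Fraction(len(a), len(X))

def squared(a):
    """Monotone and normalised, but not finitely additive."""
    return Fraction(len(a), len(X)) ** 2

uniform_ok = is_measure(uniform, D, frozenset(), X)
squared_ok = is_measure(squared, D, frozenset(), X)
```

The second function shows that monotonicity and normalisation alone do not imply condition (5): for two distinct singletons the two differences are $1/9$ and $3/9$.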


We write $\mathcal{M}_I(D)$ for the set of all measures $D \to [0,1]$, and regard it as an
ordered topological space, with the structure induced by the product order
and product topology of $[0,1]^D$. The notion of (finitely additive, probability)
$\Gamma$-valued measure is analogous to the classical one, except that the finite
additivity property splits into two conditions, involving $-$ and $\sim$.

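Definition 3. Let $D$ be a distributive lattice. A $\Gamma$-valued measure (or simply a
measure) on $D$ is a function $\mu \colon D \to \Gamma$ such that

1. $\mu(0) = 0^\circ$ and $\mu(1) = 1^\circ$,

2. $\mu$ is monotone, and

3. for all $a, b \in D$,
\[
\mu(a) \sim \mu(a \wedge b) \leq \mu(a \vee b) - \mu(b)
\qquad \text{and} \qquad
\mu(a) - \mu(a \wedge b) \geq \mu(a \vee b) \sim \mu(b).
\]
We denote by $\mathcal{M}_\Gamma(D)$ the subspace of $\Gamma^D$ consisting of the measures $\mu \colon D \to \Gamma$.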


Since $\Gamma$ is a Priestley space, so is $\Gamma^D$ equipped with the product order and
topology. Hence, we regard $\mathcal{M}_\Gamma(D)$ as an ordered topological space, whose
topology and order are induced by those of $\Gamma^D$. In fact $\mathcal{M}_\Gamma(D)$ is a Priestley space:


Proposition 4. For any distributive lattice $D$, $\mathcal{M}_\Gamma(D)$ is a Priestley space.
Proof. It suffices to show that $\mathcal{M}_\Gamma(D)$ is a closed subspace of $\Gamma^D$. Let


\[
C_{1,2} = \{f \in \Gamma^D \mid f(0) = 0^\circ\} \cap \{f \in \Gamma^D \mid f(1) = 1^\circ\} \cap \bigcap_{a \leq b} \{f \in \Gamma^D \mid f(a) \leq f(b)\}.
\]


Note that the evaluation maps $\mathrm{ev}_a \colon \Gamma^D \to \Gamma$, $f \mapsto f(a)$, are continuous for every
$a \in D$. Thus, the first set in the intersection defining $C_{1,2}$ is closed because it
is the equaliser of the evaluation map $\mathrm{ev}_0$ and the constant map of value $0^\circ$.
Similarly for the set $\{f \in \Gamma^D \mid f(1) = 1^\circ\}$. The last one is the intersection
of the sets of the form $\langle \mathrm{ev}_a, \mathrm{ev}_b \rangle^{-1}(\leq)$, which are closed because $\leq$ is closed in
$\Gamma \times \Gamma$. Whence, $C_{1,2}$ is a closed subset of $\Gamma^D$. Moreover,


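\[
\begin{aligned}
\mathcal{M}_\Gamma(D) = {} & \bigcap_{a, b \in D} \{f \in C_{1,2} \mid f(a) \sim f(a \wedge b) \leq f(a \vee b) - f(b)\} \\
& \cap \bigcap_{a, b \in D} \{f \in C_{1,2} \mid f(a) - f(a \wedge b) \geq f(a \vee b) \sim f(b)\}.
\end{aligned}
\]
From semicontinuity of $-$ and $\sim$ (Lemma 1) and the following well-known fact
in order-topology we conclude that $\mathcal{M}_\Gamma(D)$ is closed in $\Gamma^D$.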


Fact. Let $X, Y$ be compact ordered spaces, $f \colon X \to Y$ a lower semicontinuous
function and $g \colon X \to Y$ an upper semicontinuous function. If $X'$ is a closed
subset of $X$, then so is $E = \{x \in X' \mid g(x) \leq f(x)\}$.


Next, we prove a property which is very useful when approximating a
fragment of a logic by smaller fragments (see, e.g., Section 5.1). Let us denote by
DLat the category of distributive lattices and homomorphisms, and by Pries
the category of Priestley spaces and continuous monotone maps.


Proposition 5. The assignment $D \mapsto \mathcal{M}_\Gamma(D)$ yields a contravariant functor
$\mathcal{M}_\Gamma \colon \mathbf{DLat} \to \mathbf{Pries}$ which sends directed colimits to codirected limits.


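Proof. If $h \colon D \to E$ is a lattice homomorphism and $\mu \colon E \to \Gamma$ is a measure, it
is not difficult to see that $\mathcal{M}_\Gamma(h)(\mu) = \mu \cdot h \colon D \to \Gamma$ is a measure. The mapping
$\mathcal{M}_\Gamma(h) \colon \mathcal{M}_\Gamma(E) \to \mathcal{M}_\Gamma(D)$ is clearly monotone. For continuity, recall that the
topology of $\mathcal{M}_\Gamma(D)$ is generated by the sets $\llbracket a < q \rrbracket = \{\nu \colon D \to \Gamma \mid \nu(a) < q^\circ\}$
and $\llbracket a \geq q \rrbracket = \{\nu \colon D \to \Gamma \mid \nu(a) \geq q^\circ\}$, with $a \in D$ and $q \in \mathbb{Q} \cap [0,1]$. We have


\[
\mathcal{M}_\Gamma(h)^{-1}(\llbracket a < q \rrbracket) = \{\mu \colon E \to \Gamma \mid \mu(h(a)) < q^\circ\} = \llbracket h(a) < q \rrbracket
\]


which is open in $\mathcal{M}_\Gamma(E)$. Similarly, $\mathcal{M}_\Gamma(h)^{-1}(\llbracket a \geq q \rrbracket) = \llbracket h(a) \geq q \rrbracket$, showing
that $\mathcal{M}_\Gamma(h)$ is continuous. Thus, $\mathcal{M}_\Gamma$ is a contravariant functor.
The rest of the proof is a routine verification.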


Remark 6. We work with the contravariant functor $\mathcal{M}_\Gamma \colon \mathbf{DLat} \to \mathbf{Pries}$
because $\mathcal{M}_\Gamma$ is concretely defined on the lattice side. However, by Priestley duality,
DLat is dually equivalent to Pries, so we can think of $\mathcal{M}_\Gamma$ as a covariant functor
$\mathbf{Pries} \to \mathbf{Pries}$ (this is the perspective traditionally adopted in analysis, and also
in the works of Nešetřil and Ossona de Mendez). From this viewpoint, Section 6
provides a description of the endofunctor on DLat dual to $\mathcal{M}_\Gamma \colon \mathbf{Pries} \to \mathbf{Pries}$.


Recall the maps $\gamma \colon \Gamma \to [0,1]$ and $\iota \colon [0,1] \to \Gamma$ from equations (3)-(4). In
Section 3.2 we showed that this is a retraction-section pair. In Theorem 10 this
retraction is lifted to the spaces of measures. We start with an easy observation:


Lemma 7. Let $D$ be a distributive lattice. The following statements hold:


1. for every $\mu \in \mathcal{M}_\Gamma(D)$, $\gamma \cdot \mu \in \mathcal{M}_I(D)$,
2. for every $m \in \mathcal{M}_I(D)$, $\iota \cdot m \in \mathcal{M}_\Gamma(D)$.


Proof. 1. The only non-trivial condition to verify is finite additivity. In view of
the discussion after Lemma 2, the map $\gamma$ preserves both minus operations on $\Gamma$.
Hence, for every $a, b \in D$, the inequalities $\mu(a) \sim \mu(a \wedge b) \leq \mu(a \vee b) - \mu(b)$ and
$\mu(a) - \mu(a \wedge b) \geq \mu(a \vee b) \sim \mu(b)$ imply that
$\gamma \cdot \mu(a) - \gamma \cdot \mu(a \wedge b) = \gamma \cdot \mu(a \vee b) - \gamma \cdot \mu(b)$.

2. The first two conditions in Definition 3 are immediate. The third condition
follows from the fact that $\iota(r - s) = \iota(r) - \iota(s)$ whenever $s \leq r$ in $[0,1]$, and
$x \sim y \leq x - y$ for every $(x, y) \in \mathrm{dom}(-)$.


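In view of the previous lemma, there are well-defined functions


\[
\gamma^{\#} \colon \mathcal{M}_\Gamma(D) \to \mathcal{M}_I(D), \ \mu \mapsto \gamma \cdot \mu
\qquad \text{and} \qquad
\iota^{\#} \colon \mathcal{M}_I(D) \to \mathcal{M}_\Gamma(D), \ m \mapsto \iota \cdot m.
\]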


Lemma 8. $\gamma^{\#} \colon \mathcal{M}_\Gamma(D) \to \mathcal{M}_I(D)$ is a continuous and monotone map.


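Proof. The topology of $\mathcal{M}_I(D)$ is generated by the sets of the form
$\{m \in \mathcal{M}_I(D) \mid m(a) \in O\}$, for $a \in D$ and $O$ an open subset of $[0,1]$. In turn,


\[
(\gamma^{\#})^{-1}\{m \in \mathcal{M}_I(D) \mid m(a) \in O\} = \{\mu \in \mathcal{M}_\Gamma(D) \mid \mu(a) \in \gamma^{-1}(O)\}
\]


is open in $\mathcal{M}_\Gamma(D)$ because $\gamma \colon \Gamma \to [0,1]$ is continuous by Lemma 2. This shows
that $\gamma^{\#} \colon \mathcal{M}_\Gamma(D) \to \mathcal{M}_I(D)$ is continuous. Monotonicity is immediate.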


Note that $\gamma^{\#} \colon \mathcal{M}_\Gamma(D) \to \mathcal{M}_I(D)$ is surjective, since it admits $\iota^{\#}$ as a
(set-theoretic) section. It follows that $\mathcal{M}_I(D)$ is a compact ordered space:


Corollary 9. For each distributive lattice $D$, $\mathcal{M}_I(D)$ is a compact ordered space.
Proof. The surjection $\gamma^{\#} \colon \mathcal{M}_\Gamma(D) \twoheadrightarrow \mathcal{M}_I(D)$ is continuous (Lemma 8). Since
$\mathcal{M}_\Gamma(D)$ is compact by Proposition 4, so is $\mathcal{M}_I(D)$. The order of $\mathcal{M}_I(D)$ is clearly
closed in the product topology, thus $\mathcal{M}_I(D)$ is a compact ordered space.


Finally, we see that the set-theoretic retraction of $\mathcal{M}_\Gamma(D)$ onto $\mathcal{M}_I(D)$ lifts to
the topological setting, provided we restrict to the down-set topologies. If $(X, \leq)$
is a partially ordered topological space, write $X^{\downarrow}$ for the space with the same
underlying set as $X$ and whose topology consists of the open down-sets of $X$.


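Theorem 10. The maps $\gamma^{\#} \colon \mathcal{M}_\Gamma(D)^{\downarrow} \twoheadrightarrow \mathcal{M}_I(D)^{\downarrow}$ and
$\iota^{\#} \colon \mathcal{M}_I(D)^{\downarrow} \hookrightarrow \mathcal{M}_\Gamma(D)^{\downarrow}$
are a retraction-section pair of topological spaces.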


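Proof. It suffices to show that $\gamma^{\#}$ and $\iota^{\#}$ are continuous. It is not difficult to see,
using Lemma 8, that $\gamma^{\#} \colon \mathcal{M}_\Gamma(D)^{\downarrow} \to \mathcal{M}_I(D)^{\downarrow}$ is continuous. For the continuity
of $\iota^{\#}$, note that the topology of $\mathcal{M}_\Gamma(D)^{\downarrow}$ is generated by the sets of the form
$\{\mu \in \mathcal{M}_\Gamma(D) \mid \mu(a) \leq q^-\}$, for $a \in D$ and $q \in \mathbb{Q} \cap (0,1]$. We have


\[
\begin{aligned}
(\iota^{\#})^{-1}\{\mu \in \mathcal{M}_\Gamma(D) \mid \mu(a) \leq q^-\} &= \{m \in \mathcal{M}_I(D) \mid m(a) \in \iota^{-1}({\downarrow} q^-)\} \\
&= \{m \in \mathcal{M}_I(D) \mid m(a) < q\},
\end{aligned}
\]


which is an open set in $\mathcal{M}_I(D)^{\downarrow}$. This concludes the proof.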


5 The $\Gamma$-valued Stone pairing and limits of finite
structures


In the work of Nešetřil and Ossona de Mendez, the Stone pairing $\langle -, A \rangle$ is
$[0,1]$-valued, i.e. an element of $\mathcal{M}_I(\mathrm{FO}(\sigma))$. In this section, we show that basically the
same construction as for the recognisers arising from the application of a layer of
semiring quantifiers in logic on words (cf. Section 2.3) provides an embedding
of finite $\sigma$-structures into the space of $\Gamma$-valued measures. It turns out that
this embedding is a $\Gamma$-valued version of the Stone pairing. Hereafter we make a
notational difference, writing $\langle -, - \rangle_I$ for the (classical) $[0,1]$-valued Stone pairing.

The main ingredient of the construction is the notion of $\Gamma$-valued finitely supported
function. To start with, we point out that the partial operation $-$ on $\Gamma$ uniquely
determines a partial “plus” operation on $\Gamma$. Define


\[
+ \colon \mathrm{dom}(+) \to \Gamma, \qquad \text{where } \mathrm{dom}(+) = \{(x, y) \mid x \leq 1^\circ - y\},
\]
by the following rules (whenever the expressions make sense):
\[
r^\circ + s^\circ = (r + s)^\circ, \quad r^- + s^\circ = (r + s)^-, \quad r^\circ + s^- = (r + s)^-, \quad \text{and} \quad r^- + s^- = (r + s)^-.
\]
Then, for every $y \in \Gamma$, the function $(-) + y$ sending $x$ to $x + y$ is left adjoint to
the function $(-) - y$ sending $x$ to $x - y$.
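
The adjunction between $(-) + y$ and $(-) - y$ can be tested on a finite sample of points of $\Gamma$. As before, this is only a sketch under assumptions: $r^-$ and $r^\circ$ are encoded as the pairs `(r, MINUS)` and `(r, CIRC)`, only rational $r$ are modelled, and the function names are hypothetical.

```python
from fractions import Fraction
from functools import reduce

MINUS, CIRC = 0, 1   # (r, MINUS) encodes r^-, (r, CIRC) encodes r^o
ONE = (Fraction(1), CIRC)

def minus(x, y):
    """x - y on rational points of Gamma, defined when y <= x."""
    if not y <= x:
        raise ValueError("x - y is only defined when y <= x")
    (r, i), (s, j) = x, y
    return (r - s, i if j == CIRC else CIRC)

def plus(x, y):
    """x + y, defined when x <= 1^o - y; the sum is r^o only if both summands are."""
    if not x <= minus(ONE, y):
        raise ValueError("x + y is only defined when x <= 1^o - y")
    (r, i), (s, j) = x, y
    return (r + s, CIRC if i == CIRC and j == CIRC else MINUS)

points = [(Fraction(0), CIRC)] + [
    (Fraction(k, 4), d) for k in range(1, 5) for d in (MINUS, CIRC)
]

# Adjunction: x + y <= z  iff  x <= z - y, whenever both sides are defined.
adjunction = all(
    (plus(x, y) <= z) == (x <= minus(z, y))
    for x in points for y in points for z in points
    if y <= z and x <= minus(ONE, y)
)

# Four copies of (1/4)^o sum to 1^o, as required of the supports in Definition 11.
quarter = (Fraction(1, 4), CIRC)
total = reduce(plus, [quarter] * 4)
```

The last two lines mirror the situation of a four-element structure with one free variable, where each assignment contributes $(1/4)^\circ$.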
Definition 11. For any set $X$, $\mathcal{F}(X)$ is the set of all functions $f \colon X \to \Gamma$ s.t.


1. the set $\mathrm{supp}(f) = \{x \in X \mid f(x) \neq 0^\circ\}$ is finite, and
2. $f(x_1) + \cdots + f(x_n)$ is defined and equal to $1^\circ$, where $\mathrm{supp}(f) = \{x_1, \ldots, x_n\}$.


To improve readability, if the sum $y_1 + \cdots + y_m$ exists in $\Gamma$, we denote it
$\sum_{i=1}^{m} y_i$. Finitely supported functions in the above sense always determine
measures over the power-set algebra (the proof is an easy verification and is omitted):


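Lemma 12. Let $X$ be any set. There is a well-defined mapping $\int \colon \mathcal{F}(X) \to
\mathcal{M}_\Gamma(\wp(X))$, assigning to every $f \in \mathcal{F}(X)$ the measure


\[
\textstyle\int f \colon M \mapsto \int_M f = \sum \{f(x) \mid x \in M \cap \mathrm{supp}(f)\}.
\]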


5.1 The $\Gamma$-valued Stone pairing and logic on words


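Fix a countably infinite set of variables $\{v_1, v_2, \ldots\}$. Recall that $\mathrm{FO}_n(\sigma)$ is the
Lindenbaum-Tarski algebra of first-order formulas with free variables among
$\{v_1, \ldots, v_n\}$. The dual space of $\mathrm{FO}_n(\sigma)$ is the space of $n$-types $\mathrm{Typ}_n(\sigma)$. Its
points are the equivalence classes of pairs $(A, \alpha)$, where $A$ is a $\sigma$-structure and
$\alpha \colon \{v_1, \ldots, v_n\} \to A$ is an interpretation of the variables. Write $\mathrm{Fin}(\sigma)$ for the
set of all finite $\sigma$-structures and define a map $\mathrm{Fin}(\sigma) \to \mathcal{F}(\mathrm{Typ}_n(\sigma))$ as $A \mapsto f^A$,
where $f^A$ is the function which sends an equivalence class $E \in \mathrm{Typ}_n(\sigma)$ to


\[
f^A(E) = \sum_{(A, \alpha) \in E} \Big(\frac{1}{|A|^n}\Big)^{\circ}
\qquad \text{(add $\tfrac{1}{|A|^n}$ for every interpretation $\alpha$ of the free variables s.t. $(A, \alpha)$ is in the equivalence class).}
\]

By Lemma 12 we get a measure $\int f^A \colon \wp(\mathrm{Typ}_n(\sigma)) \to \Gamma$. Now, for each
$\varphi \in \mathrm{FO}_n(\sigma)$, let $\llbracket \varphi \rrbracket_n \subseteq \mathrm{Typ}_n(\sigma)$ be the set of (equivalence classes of) $\sigma$-structures
with interpretations satisfying $\varphi$. By Stone duality we obtain an embedding
$\llbracket - \rrbracket_n \colon \mathrm{FO}_n(\sigma) \hookrightarrow \wp(\mathrm{Typ}_n(\sigma))$. Restricting $\int f^A$ to $\mathrm{FO}_n(\sigma)$, we get a measure


\[
\mu_n^A \colon \mathrm{FO}_n(\sigma) \to \Gamma, \qquad \varphi \mapsto \int_{\llbracket \varphi \rrbracket_n} f^A.
\]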

Summing up, we have the composite map
\[
\mathrm{Fin}(\sigma) \to \mathcal{M}_\Gamma(\wp(\mathrm{Typ}_n(\sigma))) \to \mathcal{M}_\Gamma(\mathrm{FO}_n(\sigma)), \qquad A \mapsto \textstyle\int f^A \mapsto \mu_n^A. \tag{6}
\]
Essentially the same construction is featured in logic on words, cf. equation (2):


• The set of finite $\sigma$-structures $\mathrm{Fin}(\sigma)$ corresponds to the set of finite words $A^*$.

• The collection $\mathrm{Typ}_n(\sigma)$ of (equivalence classes of) $\sigma$-structures with
interpretations corresponds to $(A \times 2)^*$ or, interchangeably, $\beta(A \times 2)^*$ (in the case of
one free variable).

• The fragment $\mathrm{FO}_n(\sigma)$ of first-order logic corresponds to the Boolean algebra
of languages, defined by formulas with a free variable, dual to the Boolean
space $X$ appearing in (2).

• The first map in the composite (6) sends a finite structure $A$ to the measure
$\int f^A$ which, evaluated on $K \subseteq \mathrm{Typ}_n(\sigma)$, counts the (proportion of)
interpretations $\alpha \colon \{v_1, \ldots, v_n\} \to A$ such that $(A, \alpha) \in K$, similarly to $R$ from (2).

• Finally, the second map in (6) sends a measure in $\mathcal{M}_\Gamma(\wp(\mathrm{Typ}_n(\sigma)))$ to its
pushforward along $\llbracket - \rrbracket_n \colon \mathrm{FO}_n(\sigma) \hookrightarrow \wp(\mathrm{Typ}_n(\sigma))$. This is the second map in
the composition (2).


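On the other hand, the assignment $A \mapsto \mu_n^A$ defined in (6) is also closely
related to the classical Stone pairing. Indeed, for every formula $\varphi$ in $\mathrm{FO}_n(\sigma)$,


\[
\begin{aligned}
\mu_n^A(\varphi) = \sum_{E \in \llbracket \varphi \rrbracket_n} f^A(E)
&= \sum_{E \in \llbracket \varphi \rrbracket_n} \ \sum_{(A, \alpha) \in E} \Big(\frac{1}{|A|^n}\Big)^{\circ} \\
&= \Big(\frac{|\{\bar{a} \in A^n \mid A \models \varphi(\bar{a})\}|}{|A|^n}\Big)^{\circ}
= (\langle \varphi, A \rangle_I)^{\circ}.
\end{aligned} \tag{7}
\]


In this sense, $\mu_n^A$ can be regarded as a $\Gamma$-valued Stone pairing, relative to the
fragment $\mathrm{FO}_n(\sigma)$. Next, we show how to extend this to the full first-order logic
$\mathrm{FO}(\sigma)$. First, we observe that the construction is invariant under extensions of
the set of free variables (the proof is the same as in the classical case).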


Lemma 13. Given $m, n \in \mathbb{N}$ and $A \in \mathrm{Fin}(\sigma)$, if $m \geq n$ then $(\mu^A_m)|_{\mathrm{FO}_n(\sigma)} = \mu^A_n$.


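The Lindenbaum-Tarski algebra of all first-order formulas $\mathrm{FO}(\sigma)$ is the directed
colimit of the Boolean subalgebras $\mathrm{FO}_n(\sigma)$, for $n \in \mathbb{N}$. Since the functor $\mathcal{M}_\Gamma$
turns directed colimits into codirected limits (Proposition 5), the Priestley space
$\mathcal{M}_\Gamma(\mathrm{FO}(\sigma))$ is the limit of the diagram

$$\left\{ \mathcal{M}_\Gamma(\mathrm{FO}_n(\sigma)) \xleftarrow{\; q_{n,m} \;} \mathcal{M}_\Gamma(\mathrm{FO}_m(\sigma)) \;\middle|\; m, n \in \mathbb{N},\ m \geq n \right\}$$

where, for any $\mu \colon \mathrm{FO}_m(\sigma) \to \Gamma$ in $\mathcal{M}_\Gamma(\mathrm{FO}_m(\sigma))$, the measure $q_{n,m}(\mu)$ is the
restriction of $\mu$ to $\mathrm{FO}_n(\sigma)$. In view of Lemma 13, for every $A \in \mathrm{Fin}(\sigma)$, the tuple
$(\mu^A_n)_{n \in \mathbb{N}}$ is compatible with the restriction maps. Thus, recalling that limits in
the category of Priestley spaces are computed as in sets, by universality of the
limit construction, this tuple yields a measure

$$\langle -, A \rangle_\Gamma \colon \mathrm{FO}(\sigma) \to \Gamma$$

in the space $\mathcal{M}_\Gamma(\mathrm{FO}(\sigma))$. This we call the $\Gamma$-valued Stone pairing associated with
$A$. As in the classical case, it is not difficult to see that the mapping $A \mapsto \langle -, A \rangle_\Gamma$
gives an embedding $\langle -, - \rangle_\Gamma \colon \mathrm{Fin}(\sigma) \hookrightarrow \mathcal{M}_\Gamma(\mathrm{FO}(\sigma))$. The following theorem
illustrates the relation between the classical Stone pairing $\langle -, - \rangle_I \colon \mathrm{Fin}(\sigma) \hookrightarrow
\mathcal{M}_I(\mathrm{FO}(\sigma))$, and the $\Gamma$-valued one.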


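Theorem 14. The following diagram commutes:

$$\begin{array}{ccc}
 & & \mathcal{M}_\Gamma(\mathrm{FO}(\sigma)) \\
 & \overset{\langle -,- \rangle_\Gamma}{\nearrow} & \\
\mathrm{Fin}(\sigma) & & \Big\downarrow {\scriptstyle \gamma^{\#}} \\
 & \underset{\langle -,- \rangle_I}{\searrow} & \\
 & & \mathcal{M}_I(\mathrm{FO}(\sigma))
\end{array}$$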


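Proof. Fix an arbitrary finite structure $A \in \mathrm{Fin}(\sigma)$. Let $\varphi$ be a formula in
$\mathrm{FO}(\sigma)$ with free variables among $\{v_1, \dots, v_n\}$, for some $n \in \mathbb{N}$. By construction,
$\langle \varphi, A \rangle_\Gamma = \mu^A_n(\varphi)$. Therefore, by equation (7), $\langle \varphi, A \rangle_\Gamma = (\langle \varphi, A \rangle_I)^{\circ}$. The
statement then follows at once.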


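Remark. The construction in this section works also for proper fragments,
i.e. for sublattices $D \subseteq \mathrm{FO}(\sigma)$. This corresponds to composing the embedding
$\mathrm{Fin}(\sigma) \hookrightarrow \mathcal{M}_\Gamma(\mathrm{FO}(\sigma))$ with the restriction map $\mathcal{M}_\Gamma(\mathrm{FO}(\sigma)) \to \mathcal{M}_\Gamma(D)$
sending $\mu \colon \mathrm{FO}(\sigma) \to \Gamma$ to $\mu|_D \colon D \to \Gamma$. The only difference is that the ensuing map
$\mathrm{Fin}(\sigma) \to \mathcal{M}_\Gamma(D)$ need not be injective, in general.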


5.2 Limits in the spaces of measures 


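By Theorem 14, the $\Gamma$-valued Stone pairing $\langle -, - \rangle_\Gamma$ and the classical Stone
pairing $\langle -, - \rangle_I$ determine each other. However, the notions of convergence
associated with the spaces $\mathcal{M}_\Gamma(\mathrm{FO}(\sigma))$ and $\mathcal{M}_I(\mathrm{FO}(\sigma))$ are different: since the
topology of $\mathcal{M}_\Gamma(\mathrm{FO}(\sigma))$ is richer, there are "fewer" convergent sequences.
Recall from Lemma 8 that $\gamma^{\#} \colon \mathcal{M}_\Gamma(\mathrm{FO}(\sigma)) \to \mathcal{M}_I(\mathrm{FO}(\sigma))$ is continuous. Also,
$\gamma^{\#}(\langle -, A \rangle_\Gamma) = \langle -, A \rangle_I$ by Theorem 14. Thus, for any sequence of finite structures
$(A_n)_{n \in \mathbb{N}}$, if
    $\langle -, A_n \rangle_\Gamma$ converges to a measure $\mu$ in $\mathcal{M}_\Gamma(\mathrm{FO}(\sigma))$
then
    $\langle -, A_n \rangle_I$ converges to the measure $\gamma^{\#}(\mu)$ in $\mathcal{M}_I(\mathrm{FO}(\sigma))$.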


The converse is not true. For example, consider the signature $\sigma = \{<\}$
consisting of a single binary relation symbol, and let $(A_n)_{n \in \mathbb{N}}$ be the sequence of
finite posets displayed in the picture below.

[Figure: the sequence of finite posets $(A_n)_{n \in \mathbb{N}}$.]

Let $\psi(x) \equiv \forall y\, \neg(x < y) \wedge \exists z\, (\neg(z < x) \wedge \neg(z = x))$ be the formula stating that
$x$ is maximal but not the maximum in the order given by $<$. Then, for the
sublattice $D = \{\mathbf{f}, \psi, \mathbf{t}\}$ of $\mathrm{FO}(\sigma)$, the sequences $\langle -, A_n \rangle_\Gamma$ and $\langle -, A_n \rangle_I$ converge
in $\mathcal{M}_\Gamma(D)$ and $\mathcal{M}_I(D)$, respectively. However, if we consider the Boolean algebra
$B = \{\mathbf{f}, \psi, \neg\psi, \mathbf{t}\}$, then the $\langle -, A_n \rangle_I$'s still converge whereas the $\langle -, A_n \rangle_\Gamma$'s do not.
Indeed, the following sequence does not converge in $\Gamma$:


(V, Andp)n = (1°, (3)°, 1°, (2) 1°, (8), --), 
because the odd terms converge to $1^{\circ}$, while the even terms converge to $1^{-}$.
However, there is a sequence $\langle -, B_n \rangle_\Gamma$ whose image under $\gamma^{\#}$ coincides with the
limit of the $\langle -, A_n \rangle_I$'s (e.g., take the subsequence of even terms of $(A_n)_{n \in \mathbb{N}}$). In
the next theorem, we will see that this is a general fact.

Identify $\mathrm{Fin}(\sigma)$ with a subset of $\mathcal{M}_\Gamma(\mathrm{FO}(\sigma))$ (resp. $\mathcal{M}_I(\mathrm{FO}(\sigma))$) through
$\langle -, - \rangle_\Gamma$ (resp. $\langle -, - \rangle_I$). A central question in the theory of structural limits, cf. [16],
is to determine the closure of $\mathrm{Fin}(\sigma)$ in $\mathcal{M}_I(\mathrm{FO}(\sigma))$, which consists precisely of
the limits of sequences of finite structures. The following theorem gives an answer
to this question in terms of the corresponding question for $\mathcal{M}_\Gamma(\mathrm{FO}(\sigma))$.


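Theorem 15. Let $\overline{\mathrm{Fin}(\sigma)}$ denote the closure of $\mathrm{Fin}(\sigma)$ in $\mathcal{M}_\Gamma(\mathrm{FO}(\sigma))$. Then
the set $\gamma^{\#}(\overline{\mathrm{Fin}(\sigma)})$ coincides with the closure of $\mathrm{Fin}(\sigma)$ in $\mathcal{M}_I(\mathrm{FO}(\sigma))$.


Proof. Write $U$ for the image of $\langle -, - \rangle_\Gamma \colon \mathrm{Fin}(\sigma) \hookrightarrow \mathcal{M}_\Gamma(\mathrm{FO}(\sigma))$, and $V$ for the
image of $\langle -, - \rangle_I \colon \mathrm{Fin}(\sigma) \hookrightarrow \mathcal{M}_I(\mathrm{FO}(\sigma))$. We must prove that $\gamma^{\#}(\overline{U}) = \overline{V}$. By
Theorem 14, $\gamma^{\#}(U) = V$. The map $\gamma^{\#} \colon \mathcal{M}_\Gamma(\mathrm{FO}(\sigma)) \to \mathcal{M}_I(\mathrm{FO}(\sigma))$ is
continuous (Lemma 8), and the spaces $\mathcal{M}_\Gamma(\mathrm{FO}(\sigma))$ and $\mathcal{M}_I(\mathrm{FO}(\sigma))$ are compact
Hausdorff (Proposition and Corollary 9). Since continuous maps between
compact Hausdorff spaces are closed, $\gamma^{\#}(\overline{U}) = \overline{\gamma^{\#}(U)} = \overline{V}$.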


6 The logic of measures 


Let $D$ be a distributive lattice. We know from Proposition 4 that the space
$\mathcal{M}_\Gamma(D)$ of $\Gamma$-valued measures on $D$ is a Priestley space, whence it has a dual
distributive lattice $P(D)$. In this section we show that $P(D)$ can be represented
as the Lindenbaum-Tarski algebra for a propositional logic $\mathcal{PL}_D$ obtained from
$D$ by adding probabilistic quantifiers. Since we adopt a logical perspective, we
write $\mathbf{f}$ and $\mathbf{t}$ for the bottom and top elements of $D$, respectively.

The set of propositional variables of $\mathcal{PL}_D$ consists of the symbols $P_{\geq p}\, a$, for
every $a \in D$ and $p \in \mathbb{Q} \cap (0, 1]$. For every measure $\mu \in \mathcal{M}_\Gamma(D)$, we set

$$\mu \models P_{\geq p}\, a \iff \mu(a) \geq p^{\circ}. \tag{8}$$

This satisfaction relation extends in the obvious way to the closure under finite
conjunctions and finite disjunctions of the set of propositional variables. Define

$$\varphi \models \psi \quad \text{iff} \quad \forall \mu \in \mathcal{M}_\Gamma(D),\ \mu \models \varphi \text{ implies } \mu \models \psi.$$


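Also, write $\models \varphi$ if $\mu \models \varphi$ for every $\mu \in \mathcal{M}_\Gamma(D)$, and $\varphi \models$ if there is no $\mu \in
\mathcal{M}_\Gamma(D)$ with $\mu \models \varphi$.
Consider the following conditions, for any $p, q, r \in \mathbb{Q} \cap [0, 1]$ and $a, b \in D$.

(L1) $P_{\geq q}\, a \models P_{\geq p}\, a$ whenever $p \leq q$
(L2) $P_{\geq p}\, \mathbf{f} \models$ whenever $p > 0$, $\models P_{\geq 0}\, \mathbf{f}$ and $\models P_{\geq q}\, \mathbf{t}$
(L3) $P_{\geq q}\, a \models P_{\geq q}\, b$ whenever $a \leq b$
(L4) $P_{\geq p}\, a \wedge P_{\geq q}\, b \models P_{\geq p+q-r}\, (a \vee b) \vee P_{\geq r}\, (a \wedge b)$ whenever $0 \leq p+q-r \leq 1$
(L5) $P_{\geq p+q-r}\, (a \vee b) \wedge P_{\geq r}\, (a \wedge b) \models P_{\geq p}\, a \vee P_{\geq q}\, b$ whenever $0 \leq p+q-r \leq 1$

It is not hard to see that the interpretation above validates these conditions:

Lemma 16. The conditions (L1)–(L5) are satisfied in $\mathcal{M}_\Gamma(D)$.

Write $P(D)$ for the quotient of the free distributive lattice on the set

$$\{ P_{\geq p}\, a \mid p \in \mathbb{Q} \cap [0, 1],\ a \in D \}$$

with respect to the congruence generated by the conditions (L1)–(L5).

Proposition 17. Let $F \subseteq P(D)$ be a prime filter. The assignment

$$a \mapsto \bigvee \{ q^{\circ} \mid P_{\geq q}\, a \in F \}$$

defines a measure $\mu_F \colon D \to \Gamma$.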


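Proof. Items (L2) and (L3) take care of the first two conditions defining $\Gamma$-valued
measures (cf. Definition B). We prove the first half of the third condition, as the
other half is proved in a similar fashion. We must show that, for every $a, b \in D$,

$$\mu_F(a) \sim \mu_F(a \wedge b) \leq \mu_F(a \vee b) - \mu_F(b). \tag{9}$$

It is not hard to show that $\mu_F(a) \sim r^{\circ} = \bigvee \{ p^{\circ} \sim r^{\circ} \mid r^{\circ} \leq p^{\circ} \leq \mu_F(a) \}$, and
$x - (-)$ transforms non-empty joins into meets (this follows by Scott continuity
of $x - (-)$ seen as a map $[0^{\circ}, x] \to \Gamma^{\mathrm{op}}$). Hence, equation (9) is equivalent to

$$\bigvee \{ p^{\circ} \sim r^{\circ} \mid \mu_F(a \wedge b) < r^{\circ} \leq p^{\circ} \leq \mu_F(a) \} \leq \bigwedge \{ \mu_F(a \vee b) - q^{\circ} \mid q^{\circ} \leq \mu_F(b) \}.$$

To settle this inequality it is enough to show that, provided $\mu_F(a \wedge b) < r^{\circ} \leq
p^{\circ} \leq \mu_F(a)$ and $q^{\circ} \leq \mu_F(b)$, we have $(p - r)^{\circ} \leq \mu_F(a \vee b) - q^{\circ}$. The latter
inequality is equivalent to $(p + q - r)^{\circ} \leq \mu_F(a \vee b)$. In turn, using (L4) and
the fact that $F$ is a prime filter, $P_{\geq p}\, a, P_{\geq q}\, b \in F$ and $P_{\geq r}\, (a \wedge b) \notin F$ entail
$P_{\geq p+q-r}\, (a \vee b) \in F$. Whence,

$$\mu_F(a \vee b) = \bigvee \{ s^{\circ} \mid P_{\geq s}\, (a \vee b) \in F \} \geq (p + q - r)^{\circ}.$$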


We can now describe the dual lattice of $\mathcal{M}_\Gamma(D)$ as the Lindenbaum-Tarski
algebra for the logic $\mathcal{PL}_D$, built from the propositional variables $P_{\geq p}\, a$ by
imposing the laws (L1)–(L5).


Theorem 18. Let $D$ be a distributive lattice. Then the lattice $P(D)$ is isomorphic
to the distributive lattice dual to the Priestley space $\mathcal{M}_\Gamma(D)$.


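Proof. Let $X_{P(D)}$ be the space dual to $P(D)$. By Proposition 17 there is a map
$\vartheta \colon X_{P(D)} \to \mathcal{M}_\Gamma(D)$, $F \mapsto \mu_F$. We claim that $\vartheta$ is an isomorphism of Priestley
spaces. Clearly, $\vartheta$ is monotone. If $\mu_{F_1}(a) \not\leq \mu_{F_2}(a)$ for some $a \in D$, we have

$$\bigvee \{ q^{\circ} \mid P_{\geq q}\, a \in F_1 \} = \mu_{F_1}(a) \not\leq \mu_{F_2}(a) = \bigwedge \{ p^{-} \mid P_{\geq p}\, a \notin F_2 \}. \tag{10}$$

Equation (10) implies the existence of $p, q$ satisfying $P_{\geq q}\, a \in F_1$, $P_{\geq p}\, a \notin F_2$ and
$q \geq p$. It follows by (L1) that $P_{\geq p}\, a \in F_1$. We conclude that $P_{\geq p}\, a \in F_1 \setminus F_2$,
whence $F_1 \not\subseteq F_2$. This shows that $\vartheta$ is an order embedding, whence injective.
We prove that $\vartheta$ is surjective, thus a bijection. Fix a measure $\mu \in \mathcal{M}_\Gamma(D)$.
It is not hard to see, using Lemma 16, that the filter $F_\mu \subseteq P(D)$ generated by

$$\{ P_{\geq q}\, a \mid a \in D,\ q \in \mathbb{Q} \cap [0, 1],\ \mu(a) \geq q^{\circ} \}$$

is prime. Further, $\vartheta(F_\mu)(a) = \bigvee \{ q^{\circ} \mid P_{\geq q}\, a \in F_\mu \} = \bigvee \{ q^{\circ} \mid \mu(a) \geq q^{\circ} \} = \mu(a)$
for every $a \in D$. Hence, $\vartheta(F_\mu) = \mu$ and $\vartheta$ is surjective.
To settle the theorem it remains to show that $\vartheta$ is continuous. Note that for
a basic clopen of the form $C = \{ \mu \in \mathcal{M}_\Gamma(D) \mid \mu(a) \geq p^{\circ} \}$ where $a \in D$ and
$p \in \mathbb{Q} \cap (0, 1]$, the preimage $\vartheta^{-1}(C) = \{ F \in X_{P(D)} \mid \mu_F(a) \geq p^{\circ} \}$ is equal to

$$\{ F \in X_{P(D)} \mid \bigvee \{ q^{\circ} \mid P_{\geq q}\, a \in F \} \geq p^{\circ} \} = \{ F \in X_{P(D)} \mid P_{\geq p}\, a \in F \},$$

which is a clopen of $X_{P(D)}$. Similarly, if $C = \{ \mu \in \mathcal{M}_\Gamma(D) \mid \mu(a) \leq q^{-} \}$ for some
$a \in D$ and $q \in \mathbb{Q} \cap (0, 1]$, by the claim above $\vartheta^{-1}(C) = \{ F \in X_{P(D)} \mid P_{\geq q}\, a \notin F \}$,
which is again a clopen of $X_{P(D)}$.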


By Theorem 18, for any distributive lattice $D$, the lattice of clopen up-sets
of $\mathcal{M}_\Gamma(D)$ is isomorphic to the Lindenbaum-Tarski algebra $P(D)$ of our positive
propositional logic $\mathcal{PL}_D$. Moving from the lattice of clopen up-sets to the Boolean
algebra of all clopens logically corresponds to adding negation to the logic. The
logic obtained this way can be presented as follows. Introduce a new propositional
variable $P_{<q}\, a$, for each $a \in D$ and $q \in \mathbb{Q} \cap (0, 1]$. For a measure $\mu \in \mathcal{M}_\Gamma(D)$, set

$$\mu \models P_{<q}\, a \iff \mu(a) < q^{\circ}.$$

We also add a new rule, stating that $P_{<q}\, a$ is the negation of $P_{\geq q}\, a$:

(L6) $P_{<q}\, a \wedge P_{\geq q}\, a \models$ and $\models P_{<q}\, a \vee P_{\geq q}\, a$

Clearly, (L6) is satisfied in $\mathcal{M}_\Gamma(D)$. Moreover, the Boolean algebra of all clopens
of $\mathcal{M}_\Gamma(D)$ is isomorphic to the quotient of the free distributive lattice on

$$\{ P_{\geq p}\, a \mid p \in \mathbb{Q} \cap [0, 1],\ a \in D \} \cup \{ P_{<q}\, b \mid q \in \mathbb{Q} \cap [0, 1],\ b \in D \}$$

with respect to the congruence generated by the conditions (L1)–(L6).


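Specialising to $\mathrm{FO}(\sigma)$. Let us briefly discuss what happens when we instantiate $D$
with the full first-order logic $\mathrm{FO}(\sigma)$. For a formula $\varphi \in \mathrm{FO}(\sigma)$ with free variables
$v_1, \dots, v_n$ and a $q \in \mathbb{Q} \cap (0, 1]$, we have two new sentences $P_{\geq q}\, \varphi$ and $P_{<q}\, \varphi$. For
a finite $\sigma$-structure $A$ identified with its $\Gamma$-valued Stone pairing $\langle -, A \rangle_\Gamma$,

$$A \models P_{\geq q}\, \varphi \ (\text{resp. } A \models P_{<q}\, \varphi) \quad \text{iff} \quad \langle \varphi, A \rangle_\Gamma \geq q^{\circ} \ (\text{resp. } \langle \varphi, A \rangle_\Gamma < q^{\circ}).$$

That is, $P_{\geq q}\, \varphi$ is true in $A$ if a random assignment of the variables $v_1, \dots, v_n$ in $A$
satisfies $\varphi$ with probability at least $q$. Similarly for $P_{<q}\, \varphi$. If we regard $P_{\geq q}$ and
$P_{<q}$ as probabilistic quantifiers that bind all free variables of a given formula,
the Stone pairing $\langle -, - \rangle_\Gamma \colon \mathrm{Fin}(\sigma) \to \mathcal{M}_\Gamma(\mathrm{FO}(\sigma))$ can be seen as the embedding of
finite structures into the space of types for the logic $\mathcal{PL}_{\mathrm{FO}(\sigma)}$.

Conclusion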


Types are points of the dual space of a logic (viewed as a Boolean algebra). In 
classical first-order logic, 0-types are just the models modulo elementary equiv- 
alence. But when there are not ‘enough’ models, as in finite model theory, the 
spaces of types provide completions of the sets of models. 

In B], it was shown that for logic on words and various quantifiers we have 
that, given a Boolean algebra of formulas with a free variable, the space of types 
of the Boolean algebra generated by the formulas obtained by quantification 
is given by a measure space construction. Here we have shown that a suitable 
enrichment of first-order logic gives rise to a space of measures $\mathcal{M}_\Gamma(\mathrm{FO}(\sigma))$
closely related to the space $\mathcal{M}_I(\mathrm{FO}(\sigma))$ used in the theory of structural limits.
Indeed, Theorem 14 tells us that the ensuing Stone pairings interdetermine each
other. Further, the Stone pairing for $\mathcal{M}_\Gamma(\mathrm{FO}(\sigma))$ is just the embedding of the
finite models in the completion/compactification provided by the space of types 
of the enriched logic. 

These results identify the logical gist of the theory of structural limits, and 
provide a new and interesting connection between logic on words and the theory 
of structural limits in finite model theory. But we also expect that it may prove 
a useful tool in its own right. Thus, for structural limits, it is an open problem to
characterise the closure of the image of the $[0, 1]$-valued Stone pairing [16].
Reasoning in the $\Gamma$-valued setting, native to logic and where we can use duality, one
would expect that this is the subspace $\mathcal{M}_\Gamma(\mathrm{Th}(\mathrm{Fin}))$ of $\mathcal{M}_\Gamma(\mathrm{FO}(\sigma))$ given by the
quotient $\mathrm{FO}(\sigma) \twoheadrightarrow \mathrm{Th}(\mathrm{Fin})$ onto the theory of pseudofinite structures. The
purpose of such a characterisation would be to understand the points of the closure
as "generalised models". Another subject that we would like to investigate is that
of zero-one laws. The zero-one law for first-order logic states that the sequence
of measures for which the nth measure, on a sentence $\psi$, yields the proportion of
n-element structures satisfying $\psi$, converges to a $\{0, 1\}$-valued measure. Over $\Gamma$
this will no longer be true as 1 is split into its 'limiting' and 'achieved' personae.
Yet, we expect the above sequence to converge also in this setting and, by
Theorem it will converge to a $\{0^{\circ}, 1^{-}, 1^{\circ}\}$-valued measure. Understanding this
more fine-grained measure may yield useful information about the zero-one law. 

Further, it would be interesting to investigate whether the limits for schema 
mappings introduced by Kolaitis et al. may be seen also as a type-theoretic 
construction. Finally, we would want to explore the connections with other se- 
mantically inspired approaches to finite model theory, such as those recently put 
forward by Abramsky, Dawar et al. PJB].
Abstract. We present semantic correctness proofs of Automatic
Differentiation (AD). We consider a forward-mode AD method on a higher
order language with algebraic data types, and we characterise it as the 
unique structure preserving macro given a choice of derivatives for basic 
operations. We describe a rich semantics for differentiable programming, 
based on diffeological spaces. We show that it interprets our language, 
and we phrase what it means for the AD method to be correct with re- 
spect to this semantics. We show that our characterisation of AD gives 
rise to an elegant semantic proof of its correctness based on a gluing 
construction on diffeological spaces. We explain how this is, in essence, 
a logical relations argument. Finally, we sketch how the analysis extends 


to other AD methods by considering a continuation-based method. 


1 Introduction 


Automatic differentiation (AD), loosely speaking, is the process of taking a pro- 
gram describing a function, and building the derivative of that function by ap- 
plying the chain rule across the program code. As gradients play a central role in 
many aspects of machine learning, so too do automatic differentiation systems 
such as TensorFlow [1] or Stan [6].


Differentiation has a well developed mathematical theory in terms of
differential geometry. The aim of this paper is to formalize this connection
between differential geometry and the syntactic operations of AD. In this way we
achieve two things: (1) a compositional, denotational understanding of
differentiable programming and AD; (2) an explanation of the correctness of AD.

[Diagram: a square with "Programs" in both top corners, joined by an arrow
labelled "automatic differentiation"; "Differential geometry" in both bottom
corners, joined by an arrow labelled "math differentiation"; and vertical
arrows labelled "denotational semantics" on each side.]

Fig. 1. Overview of semantics/correctness of AD.

This intuitive correspondence (summarized in Fig. 1) is in fact rather com- 
plicated. In this paper we focus on resolving the following problem: higher order 
functions play a key role in programming, and yet they have no counterpart in 
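traditional differential geometry. Moreover, we resolve this problem while
retaining the compositionality of denotational semantics.

Higher order functions and differentiation. A major application of higher
order functions is to support disciplined code reuse. The need for code reuse is
particularly acute in machine learning. For example, a multi-layer neural network
might be built of millions of near-identical neurons, as follows.

$$\begin{aligned}
\mathrm{neuron}_n &: (\mathbf{real}^n * (\mathbf{real}^n * \mathbf{real})) \to \mathbf{real} \\
\mathrm{neuron}_n &\stackrel{\mathrm{def}}{=} \lambda \langle x, \langle w, b \rangle \rangle.\ \varsigma(w \cdot x + b) \\
\mathrm{layer}_n &: ((\tau_1 * P) \to \tau_2) \to (\tau_1 * P^n) \to \tau_2^n \\
\mathrm{layer}_n &\stackrel{\mathrm{def}}{=} \lambda f.\ \lambda \langle x, \langle p_1, \dots, p_n \rangle \rangle.\ \langle f \langle x, p_1 \rangle, \dots, f \langle x, p_n \rangle \rangle \\
\mathrm{comp} &: (((\tau_1 * P) \to \tau_2) * ((\tau_2 * Q) \to \tau_3)) \to (\tau_1 * (P * Q)) \to \tau_3 \\
\mathrm{comp} &\stackrel{\mathrm{def}}{=} \lambda \langle f, g \rangle.\ \lambda \langle x, \langle p, q \rangle \rangle.\ g \langle f \langle x, p \rangle, q \rangle
\end{aligned}$$

[Plot of the sigmoid function $\varsigma$.]

(Here $\varsigma(x) \stackrel{\mathrm{def}}{=} \frac{1}{1 + e^{-x}}$ is the sigmoid function, as illustrated.) We can use these
functions to build a network as follows (see also Fig. 2):

$$\mathrm{comp} \langle \mathrm{layer}_m(\mathrm{neuron}_k),\ \mathrm{comp} \langle \mathrm{layer}_n(\mathrm{neuron}_m),\ \mathrm{neuron}_n \rangle \rangle : (\mathbf{real}^k * P) \to \mathbf{real} \tag{1}$$

Here $P \cong \mathbf{real}^p$ with $p = (m(k+1) + n(m+1) + n + 1)$.
This program (1) describes a smooth (infinitely differentiable) function. The
goal of automatic differentiation is to find its derivative.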

If we $\beta$-reduce all the $\lambda$'s, we end up with a very
long function expression just built from the sigmoid
function and linear algebra. We can then find a program
for calculating its derivative by applying the
chain rule. However, automatic differentiation can
also be expressed without first $\beta$-reducing, in a
compositional way, by explaining how higher order
functions like (layer) and (comp) propagate derivatives.
This paper is a semantic analysis of this compositional approach. 
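
To make the role of the higher order combinators concrete, here is a minimal Python sketch of program (1). It is an illustration only, not the paper's language: Python closures stand in for $\lambda$-abstractions, tuples stand in for product types, and the names `sigmoid`, `parameter_count`, `zero_params` and the sample sizes are assumptions introduced here.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def neuron(x, params):
    """neuron_n: input vector x and parameters (w, b)."""
    w, b = params
    return sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)

def layer(f):
    """layer_n f: apply f to a shared input with each of n parameter sets."""
    return lambda x, ps: tuple(f(x, p) for p in ps)

def comp(f, g):
    """comp <f, g>: feed the output of f (using p) into g (using q)."""
    return lambda x, pq: g(f(x, pq[0]), pq[1])

# The network of (1): two hidden layers followed by a single output neuron.
network = comp(layer(neuron), comp(layer(neuron), neuron))

def parameter_count(k, m, n):
    """p = m(k+1) + n(m+1) + n + 1, as stated after (1)."""
    return m * (k + 1) + n * (m + 1) + n + 1

def zero_params(size):
    """Hypothetical helper: all-zero weights and bias for one neuron."""
    return ((0.0,) * size, 0.0)

k, m, n = 3, 2, 2
params = (tuple(zero_params(k) for _ in range(m)),
          (tuple(zero_params(m) for _ in range(n)), zero_params(n)))
output = network((1.0, 2.0, 3.0), params)  # every neuron sees 0, so outputs 0.5
```

With all parameters set to zero each neuron computes $\varsigma(0)$, which makes the result easy to check by hand; any other parameter values of the same shape work equally well.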

The general idea of denotational semantics is to interpret types as spaces 
and programs as functions between the spaces. In this paper, we propose to use 
diffeological spaces and smooth functions [32,16] to this end. These satisfy the 
following three desiderata: 


Fig. 2. The network in (1) 
with k inputs and two hid- 
den layers. 


– $\mathbb{R}$ is a space, and the smooth functions $\mathbb{R} \to \mathbb{R}$ are exactly the functions that
are infinitely differentiable;

– The set of smooth functions $X \to Y$ between spaces again forms a space, so
we can interpret function types.

– The disjoint union of a sequence of spaces again forms a space, so we can
interpret variant types and inductive types.


We emphasise that the most standard formulation of differential geometry, using
manifolds, does not support spaces of functions. Diffeological spaces seem to us
the simplest notion of space that satisfies these conditions, but there are other
candidates [3, 33]. A diffeological space is in particular a set $X$ equipped with a
chosen set of curves $C_X \subseteq X^{\mathbb{R}}$ and a smooth map $f : X \to Y$ must be such that
if $\gamma \in C_X$ then $\gamma; f \in C_Y$. This is reminiscent of the method of logical relations.


From smoothness to automatic derivatives at higher types. Our denota- 
tional semantics in diffeological spaces guarantees that all definable functions are 
smooth. But we need more than just to know that a definable function happens 
to have a mathematical derivative: we need to be able to find that derivative. 

In this paper we focus on a simple, forward mode automatic differentiation 
method, which is a macro translation on syntax (called D in §2). We are able to 
show that it is correct, using our denotational semantics. 

Here there is one subtle point that is central to our development. Although 
differential geometry provides established derivatives for first order functions 
(such as neuron above), there is no canonical notion of derivative for higher order 
functions (such as layer and comp) in the theory of diffeological spaces (e.g. [7]). 
We propose a new way to resolve this, by interpreting types as triples $(X, X', S)$
where, intuitively, $X$ is a space of inhabitants of the type, $X'$ is a space serving
as a chosen bundle of tangents over $X$, and $S \subseteq X^{\mathbb{R}} \times X'^{\mathbb{R}}$ is a binary relation
between curves, informally relating curves in $X$ with their tangent curves in $X'$.
This new model gives a denotational semantics for automatic differentiation. 

In §3 we boil this new approach down to a straightforward and elementary 
logical relations argument for the correctness of automatic differentiation. The 
approach is explained in detail in §5.


Related work and context. AD has a long history and has many implemen- 
tations. AD was perhaps first phrased in a functional setting in [26], and there 
are now a number of teams working on AD in the functional setting (e.g. [34, 
31, 12]), some providing efficient implementations. Although that work does not 
involve formal semantics, it is inspired by intuitions from differential geometry 
and category theory. 

This paper adds to a very recent body of work on verified automatic differen- 
tiation. Much of this is concurrent with and independent from the work in this 
article. In the first order setting, there are recent accounts based on denotational 
semantics in manifolds [13] and based on synthetic differential geometry [9], as 
well as work making a categorical abstraction [8] and work connecting oper- 
ational semantics with denotational semantics [2,28]. Recently there has also 
been significant progress at higher types. The work of Brunel et al. gives formal 
correctness proofs for reverse-mode derivatives on computation graphs [5]. The 
work of Barthe et al. [4] provides a general discussion of some new syntactic 
logical relations arguments including one very similar to our syntactic proof of 
Theorem 1. We understand that the authors of [9] are working on higher types. 

The differential $\lambda$-calculus [11] is related to AD, and explicit connections are
made in [22, 23]. One difference is that the differential $\lambda$-calculus allows addition
of terms at all types, and hence vector space models are suitable, but this appears 
peculiar with the variant and inductive types that we consider here. 

Finally we emphasise that we have chosen the neural network (1) as our
running example mainly for its simplicity. There are many other examples of AD
outside the neural networks literature: AD is useful whenever derivatives need to
be calculated on high dimensional spaces. This includes optimization problems
more generally, where the derivative is passed to a gradient descent method 
(e.g. [30, 18, 29, 19, 10, 21]). Other applications of AD are in advanced integration 
methods, since derivatives play a role in Hamiltonian Monte Carlo [25,14] and 
variational inference [20]. 


Summary of contributions. We have provided a semantic analysis of auto- 
matic differentiation. Our syntactic starting point is a well-known forward-mode 
AD macro on a typed higher order language (e.g. [31, 34]). We recall this in §2 
for function types, and in §4 we extend it to inductive types and variants. The
main contributions of this paper are as follows. 


— We give a denotational semantics for the language in diffeological spaces, 
showing that every definable expression is smooth (§3). 

— We show correctness of the AD macro by a logical relations argument (Th. 1). 

— We give a categorical analysis of this correctness argument with two parts: 
canonicity of the macro in terms of syntactic categories, and a new notion 
of glued space that abstracts the logical relation (§5). 

— We then use this analysis to state and prove a correctness argument at all 
first order types (Th. 2). 

— We show that our method is not specific to one particular AD macro, by 
also considering a continuation-based AD method (§6). 


2 A simple forward-mode AD translation 


Rudiments of differentiation and dual numbers. Recall that the derivative
of a function $f : \mathbb{R} \to \mathbb{R}$, if it exists, is a function $\nabla f : \mathbb{R} \to \mathbb{R}$ such that
$\nabla f(x_0) = \frac{\mathrm{d} f(x)}{\mathrm{d} x}(x_0)$ is the gradient of $f$ at $x_0$.
To find $\nabla f$ in a compositional way, two generalizations are reasonable:
– We need both $f$ and $\nabla f$ when calculating $\nabla(f; g)$ of a composition $f; g$, using
the chain rule, so we are really interested in the pair $(f, \nabla f) : \mathbb{R} \to \mathbb{R} \times \mathbb{R}$;
– In building $f$ we will need to consider functions of multiple arguments, such
as $+ : \mathbb{R}^2 \to \mathbb{R}$, and these functions should propagate derivatives.
Thus we are more generally interested in transforming a function $g : \mathbb{R}^n \to \mathbb{R}$
into a function $h : (\mathbb{R} \times \mathbb{R})^n \to \mathbb{R} \times \mathbb{R}$ in such a way that for any $f_1 \dots f_n : \mathbb{R} \to \mathbb{R}$,

$$(f_1, \nabla f_1, \dots, f_n, \nabla f_n); h = ((f_1, \dots, f_n); g,\ \nabla((f_1, \dots, f_n); g)). \tag{2}$$


An intuition for h is often given in terms of dual numbers. The transformed 
function operates on pairs of numbers, $(x, x')$, and it is common to think of such
a pair as $x + x'\varepsilon$ for an 'infinitesimal' $\varepsilon$. But while this is a helpful intuition,
the formalization of infinitesimals can be intricate, and the development in this 
paper is focussed on the elementary formulation in (2). 


The reader may also notice that $h$ encodes all the partial derivatives of $g$.
For example, if $g : \mathbb{R}^2 \to \mathbb{R}$, then with $f_1(x) \stackrel{\mathrm{def}}{=} x$ and $f_2(x) \stackrel{\mathrm{def}}{=} x_2$, by
applying (2) to $x_1$ we obtain $h(x_1, 1, x_2, 0) = \left(g(x_1, x_2),\ \frac{\partial g(x, x_2)}{\partial x}(x_1)\right)$ and similarly
$h(x_1, 0, x_2, 1) = \left(g(x_1, x_2),\ \frac{\partial g(x_1, x)}{\partial x}(x_2)\right)$. And conversely, if $g$ is differentiable in
each argument, then a unique $h$ satisfying (2) can be found by taking linear
combinations of partial derivatives:

$$h(x_1, x_1', x_2, x_2') = \left(g(x_1, x_2),\ x_1' \cdot \frac{\partial g(x, x_2)}{\partial x}(x_1) + x_2' \cdot \frac{\partial g(x_1, x)}{\partial x}(x_2)\right).$$


In summary, the idea of differentiation with dual numbers is to transform a 
differentiable function $g : \mathbb{R}^n \to \mathbb{R}$ to a function $h : \mathbb{R}^{2n} \to \mathbb{R}^2$ which captures $g$
and all its partial derivatives. We packaged this up in (2) as a sort-of invariant
which is useful for building derivatives of compound functions $\mathbb{R} \to \mathbb{R}$ in a
compositional way. The idea of forward mode automatic differentiation is to 
perform this transformation at the source code level. 
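
The dual numbers idea can be tried out directly. The following Python sketch is an illustration under stated assumptions, not part of the paper's development: the class `Dual`, the sample function `g` and the wrapper `h` are hypothetical names, and only addition and multiplication are supported.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Dual:
    """A pair (x, x'), read informally as x + x' * epsilon."""
    primal: float
    tangent: float

    def __add__(self, other):
        # Sum rule: tangents add.
        return Dual(self.primal + other.primal, self.tangent + other.tangent)

    def __mul__(self, other):
        # Product rule: (x * y)' = x * y' + x' * y.
        return Dual(self.primal * other.primal,
                    self.primal * other.tangent + self.tangent * other.primal)

def g(x1, x2):
    """Sample function g(x1, x2) = x1 * x2 + x2; works on floats and on Duals."""
    return x1 * x2 + x2

def h(x1, dx1, x2, dx2):
    """The transformed function: value of g together with its derivative
    in the direction (dx1, dx2)."""
    result = g(Dual(x1, dx1), Dual(x2, dx2))
    return (result.primal, result.tangent)

partial_1 = h(3.0, 1.0, 2.0, 0.0)    # second component: dg/dx1 = x2
partial_2 = h(3.0, 0.0, 2.0, 1.0)    # second component: dg/dx2 = x1 + 1
direction = h(3.0, 5.0, 2.0, 7.0)    # linear combination 5 * dg/dx1 + 7 * dg/dx2
```

Seeding the tangents with $(1, 0)$ or $(0, 1)$ recovers the two partial derivatives, and a general seed gives the corresponding linear combination, matching the formula for $h$ above.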


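A simple language of smooth functions. We consider a standard higher
order typed language with a first order type real of real numbers. The types
$(\tau, \sigma)$ and terms $(t, s)$ are as follows.

$$\begin{array}{rcll}
\tau, \sigma, \rho & ::= & & \text{types} \\
 & \mid & \mathbf{real} & \text{real numbers} \\
 & \mid & (\tau_1 * \dots * \tau_n) & \text{finite product} \\
 & \mid & \tau \to \sigma & \text{function} \\
t, s, r & ::= & & \text{terms} \\
 & & x & \text{variable} \\
 & \mid & c \mid t + s \mid t * s \mid \varsigma(t) & \text{operations/constants} \\
 & \mid & \langle t_1, \dots, t_n \rangle \mid \mathbf{case}\ t\ \mathbf{of}\ \langle x_1, \dots, x_n \rangle \to s & \text{tuples/pattern matching} \\
 & \mid & \lambda x. t \mid t\, s & \text{function abstraction/app.}
\end{array}$$

The typing rules are in Figure 3. We have included a minimal set of operations
for the sake of illustration, but it is not difficult to add further operations. We
add some simple syntactic sugar $t - u \stackrel{\mathrm{def}}{=} t + (-1) * u$. We intend $\varsigma$ to stand
for the sigmoid function, $\varsigma(x) \stackrel{\mathrm{def}}{=} \frac{1}{1 + e^{-x}}$. We further include syntactic sugar
$\mathbf{let}\ x = t\ \mathbf{in}\ s$ for $(\lambda x. s)\, t$ and $\lambda \langle x_1, \dots, x_n \rangle. t$ for $\lambda x. \mathbf{case}\ x\ \mathbf{of}\ \langle x_1, \dots, x_n \rangle \to t$.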


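Syntactic automatic differentiation: a functorial macro. The aim of
forward mode AD is to find the dual numbers representation of a function by
syntactic manipulations. For our simple language, we implement this as the
following inductively defined macro $\overrightarrow{\mathcal{D}}$ on both types and terms (see also [34, 31]):

$$\overrightarrow{\mathcal{D}}(\mathbf{real}) \stackrel{\mathrm{def}}{=} (\mathbf{real} * \mathbf{real}) \qquad \overrightarrow{\mathcal{D}}(\tau \to \sigma) \stackrel{\mathrm{def}}{=} \overrightarrow{\mathcal{D}}(\tau) \to \overrightarrow{\mathcal{D}}(\sigma)$$

$$\overrightarrow{\mathcal{D}}((\tau_1 * \dots * \tau_n)) \stackrel{\mathrm{def}}{=} (\overrightarrow{\mathcal{D}}(\tau_1) * \dots * \overrightarrow{\mathcal{D}}(\tau_n))$$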


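$$\frac{}{\Gamma \vdash c : \mathbf{real}}\,(c \in \mathbb{R}) \qquad
\frac{\Gamma \vdash t : \mathbf{real} \quad \Gamma \vdash s : \mathbf{real}}{\Gamma \vdash t + s : \mathbf{real}} \qquad
\frac{\Gamma \vdash t : \mathbf{real} \quad \Gamma \vdash s : \mathbf{real}}{\Gamma \vdash t * s : \mathbf{real}} \qquad
\frac{\Gamma \vdash t : \mathbf{real}}{\Gamma \vdash \varsigma(t) : \mathbf{real}}$$

$$\frac{\Gamma \vdash t_1 : \tau_1 \ \dots\ \Gamma \vdash t_n : \tau_n}{\Gamma \vdash \langle t_1, \dots, t_n \rangle : (\tau_1 * \dots * \tau_n)} \qquad
\frac{\Gamma \vdash t : (\sigma_1 * \dots * \sigma_n) \quad \Gamma, x_1 : \sigma_1, \dots, x_n : \sigma_n \vdash s : \tau}{\Gamma \vdash \mathbf{case}\ t\ \mathbf{of}\ \langle x_1, \dots, x_n \rangle \to s : \tau}$$

$$\frac{}{\Gamma \vdash x : \tau}\,((x : \tau) \in \Gamma) \qquad
\frac{\Gamma, x : \tau \vdash t : \sigma}{\Gamma \vdash \lambda x : \tau. t : \tau \to \sigma} \qquad
\frac{\Gamma \vdash t : \sigma \to \tau \quad \Gamma \vdash s : \sigma}{\Gamma \vdash t\, s : \tau}$$

Fig. 3. Typing rules for the simple language.

$$\begin{aligned}
\overrightarrow{\mathcal{D}}(x) &\stackrel{\mathrm{def}}{=} x \qquad \overrightarrow{\mathcal{D}}(c) \stackrel{\mathrm{def}}{=} \langle c, 0 \rangle \\
\overrightarrow{\mathcal{D}}(t + s) &\stackrel{\mathrm{def}}{=} \mathbf{case}\ \overrightarrow{\mathcal{D}}(t)\ \mathbf{of}\ \langle x, x' \rangle \to \mathbf{case}\ \overrightarrow{\mathcal{D}}(s)\ \mathbf{of}\ \langle y, y' \rangle \to \langle x + y,\ x' + y' \rangle \\
\overrightarrow{\mathcal{D}}(t * s) &\stackrel{\mathrm{def}}{=} \mathbf{case}\ \overrightarrow{\mathcal{D}}(t)\ \mathbf{of}\ \langle x, x' \rangle \to \mathbf{case}\ \overrightarrow{\mathcal{D}}(s)\ \mathbf{of}\ \langle y, y' \rangle \to \langle x * y,\ x * y' + x' * y \rangle \\
\overrightarrow{\mathcal{D}}(\varsigma(t)) &\stackrel{\mathrm{def}}{=} \mathbf{case}\ \overrightarrow{\mathcal{D}}(t)\ \mathbf{of}\ \langle x, x' \rangle \to \mathbf{let}\ y = \varsigma(x)\ \mathbf{in}\ \langle y,\ x' * y * (1 - y) \rangle \\
\overrightarrow{\mathcal{D}}(\lambda x. t) &\stackrel{\mathrm{def}}{=} \lambda x. \overrightarrow{\mathcal{D}}(t) \qquad \overrightarrow{\mathcal{D}}(t\, s) \stackrel{\mathrm{def}}{=} \overrightarrow{\mathcal{D}}(t)\, \overrightarrow{\mathcal{D}}(s) \qquad \overrightarrow{\mathcal{D}}(\langle t_1, \dots, t_n \rangle) \stackrel{\mathrm{def}}{=} \langle \overrightarrow{\mathcal{D}}(t_1), \dots, \overrightarrow{\mathcal{D}}(t_n) \rangle \\
\overrightarrow{\mathcal{D}}(\mathbf{case}\ t\ \mathbf{of}\ \langle x_1, \dots, x_n \rangle \to s) &\stackrel{\mathrm{def}}{=} \mathbf{case}\ \overrightarrow{\mathcal{D}}(t)\ \mathbf{of}\ \langle x_1, \dots, x_n \rangle \to \overrightarrow{\mathcal{D}}(s)
\end{aligned}$$

We extend $\overrightarrow{\mathcal{D}}$ to contexts: $\overrightarrow{\mathcal{D}}(\{x_1 : \tau_1, \dots, x_n : \tau_n\}) \stackrel{\mathrm{def}}{=} \{x_1 : \overrightarrow{\mathcal{D}}(\tau_1), \dots, x_n : \overrightarrow{\mathcal{D}}(\tau_n)\}$.
This turns $\overrightarrow{\mathcal{D}}$ into a well-typed, functorial macro in the following sense.

Lemma 1 (Functorial macro). If $\Gamma \vdash t : \tau$ then $\overrightarrow{\mathcal{D}}(\Gamma) \vdash \overrightarrow{\mathcal{D}}(t) : \overrightarrow{\mathcal{D}}(\tau)$.
If $\Gamma, x : \sigma \vdash t : \tau$ and $\Gamma \vdash s : \sigma$ then $\overrightarrow{\mathcal{D}}(\Gamma) \vdash \overrightarrow{\mathcal{D}}(t[s/x]) = \overrightarrow{\mathcal{D}}(t)[\overrightarrow{\mathcal{D}}(s)/x]$.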


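Example 1 (Inner products). Let us write $\tau^n$ for the n-fold product $(\tau * \dots * \tau)$.
Then, given $\Gamma \vdash t, s : \mathbf{real}^n$ we can define their inner product

$$\Gamma \vdash t \cdot_n s \stackrel{\mathrm{def}}{=} \mathbf{case}\ t\ \mathbf{of}\ \langle z_1, \dots, z_n \rangle \to \mathbf{case}\ s\ \mathbf{of}\ \langle y_1, \dots, y_n \rangle \to z_1 * y_1 + \dots + z_n * y_n : \mathbf{real}$$

To illustrate the calculation of $\overrightarrow{\mathcal{D}}$, let us expand (and $\beta$-reduce) $\overrightarrow{\mathcal{D}}(t \cdot_2 s)$:

$$\begin{aligned}
&\mathbf{case}\ \overrightarrow{\mathcal{D}}(t)\ \mathbf{of}\ \langle z_1, z_2 \rangle \to \mathbf{case}\ \overrightarrow{\mathcal{D}}(s)\ \mathbf{of}\ \langle y_1, y_2 \rangle \to \mathbf{case}\ z_1\ \mathbf{of}\ \langle z_{1,1}, z_{1,2} \rangle \to \\
&\mathbf{case}\ y_1\ \mathbf{of}\ \langle y_{1,1}, y_{1,2} \rangle \to \mathbf{case}\ z_2\ \mathbf{of}\ \langle z_{2,1}, z_{2,2} \rangle \to \mathbf{case}\ y_2\ \mathbf{of}\ \langle y_{2,1}, y_{2,2} \rangle \to \\
&\langle z_{1,1} * y_{1,1} + z_{2,1} * y_{2,1},\ z_{1,1} * y_{1,2} + z_{1,2} * y_{1,1} + z_{2,1} * y_{2,2} + z_{2,2} * y_{2,1} \rangle
\end{aligned}$$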


Example 2 (Neural networks). In our introduction (1), we provided a program
in our language to build a neural network out of expressions neuron, layer, comp;
this program makes use of the inner product of Ex. 1. We can similarly calculate
$\overrightarrow{\mathcal{D}}$ of such deep neural nets by mechanically applying the macro.
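
The mechanical nature of the macro can be illustrated with a small Python sketch that represents terms as nested tuples, translates them recursively, and evaluates the result. This is a sketch under stated assumptions rather than the paper's formal definition: the term encoding, the functions `D`, `ev` and `fresh`, and the fresh-name scheme (names starting with an underscore, assumed not to occur in source terms) are all introduced here for illustration, and the sigmoid case recomputes $\varsigma(x)$ instead of using a let-binding.

```python
import itertools
import math

_counter = itertools.count()

def fresh(prefix):
    """Hypothetical fresh-name supply; source terms must not use '_'-names."""
    return f"_{prefix}{next(_counter)}"

def D(t):
    """Forward-mode macro on terms encoded as tagged tuples."""
    tag = t[0]
    if tag == "var":                      # D(x) = x
        return t
    if tag == "const":                    # D(c) = <c, 0>
        return ("tuple", [t, ("const", 0.0)])
    if tag in ("add", "mul"):
        x, dx, y, dy = (fresh(p) for p in ("x", "dx", "y", "dy"))
        X, DX, Y, DY = (("var", v) for v in (x, dx, y, dy))
        if tag == "add":
            body = ("tuple", [("add", X, Y), ("add", DX, DY)])
        else:                             # product rule
            body = ("tuple", [("mul", X, Y),
                              ("add", ("mul", X, DY), ("mul", DX, Y))])
        return ("case", D(t[1]), [x, dx], ("case", D(t[2]), [y, dy], body))
    if tag == "sig":                      # sigmoid'(x) = s * (1 - s)
        x, dx = fresh("x"), fresh("dx")
        s = ("sig", ("var", x))
        one_minus_s = ("add", ("const", 1.0), ("mul", ("const", -1.0), s))
        body = ("tuple", [s, ("mul", ("var", dx), ("mul", s, one_minus_s))])
        return ("case", D(t[1]), [x, dx], body)
    if tag == "tuple":
        return ("tuple", [D(u) for u in t[1]])
    if tag == "case":
        return ("case", D(t[1]), t[2], D(t[3]))
    if tag == "lam":
        return ("lam", t[1], D(t[2]))
    if tag == "app":
        return ("app", D(t[1]), D(t[2]))
    raise ValueError(f"unknown term: {tag}")

def ev(t, env):
    """A small evaluator for the same term encoding."""
    tag = t[0]
    if tag == "var":
        return env[t[1]]
    if tag == "const":
        return t[1]
    if tag == "add":
        return ev(t[1], env) + ev(t[2], env)
    if tag == "mul":
        return ev(t[1], env) * ev(t[2], env)
    if tag == "sig":
        return 1.0 / (1.0 + math.exp(-ev(t[1], env)))
    if tag == "tuple":
        return tuple(ev(u, env) for u in t[1])
    if tag == "case":
        return ev(t[3], {**env, **dict(zip(t[2], ev(t[1], env)))})
    if tag == "lam":
        return lambda v: ev(t[2], {**env, t[1]: v})
    if tag == "app":
        return ev(t[1], env)(ev(t[2], env))
    raise ValueError(f"unknown term: {tag}")

x = ("var", "x")
poly = ("add", ("mul", x, x), x)                       # x * x + x
square = ("lam", "z", ("mul", ("var", "z"), ("var", "z")))
twice = ("lam", "f", ("lam", "y",
         ("app", ("var", "f"), ("app", ("var", "f"), ("var", "y")))))
quartic = ("app", ("app", twice, square), x)           # (x^2)^2 via a higher order term

poly_result = ev(D(poly), {"x": (3.0, 1.0)})
quartic_result = ev(D(quartic), {"x": (2.0, 1.0)})
sig_result = ev(D(("sig", x)), {"x": (0.0, 1.0)})
```

Free variables of type real are bound to pairs (value, tangent) in the environment, so seeding the tangent with 1 yields the ordinary derivative in the second component. Note that the translation of the higher order term `twice` involves no differentiation at all: abstraction and application are simply mapped through, as in the macro.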


3 Semantics of differentiation 


Consider for a moment the first order fragment of the language in § 2, with only
one type, real, and no $\lambda$'s or pairs. This has a simple semantics in the category
of cartesian spaces and smooth maps. Indeed, a term $x_1, \dots, x_n : \mathbf{real} \vdash t : \mathbf{real}$
has a natural reading as a function $[\![t]\!] : \mathbb{R}^n \to \mathbb{R}$ by interpreting our operation
symbols by the well-known operations on $\mathbb{R}^n \to \mathbb{R}$ with the corresponding name.
In fact, the functions that are definable in this first order fragment are smooth,
which means that they are continuous, differentiable, and their derivatives are
continuous, differentiable, and so on. Let us write CartSp for this category of
cartesian spaces ($\mathbb{R}^n$ for some $n$) and smooth functions.

The category CartSp has cartesian products, and so we can also interpret 
product types, tupling and pattern matching, giving us a useful syntax for con- 
structing functions into and out of products of R. For example, the interpretation 
of $(\mathrm{neuron}_n)$ in (1) becomes

$$\mathbb{R}^n \times \mathbb{R}^n \times \mathbb{R} \xrightarrow{[\![\cdot_n]\!] \times \mathrm{id}_{\mathbb{R}}} \mathbb{R} \times \mathbb{R} \xrightarrow{[\![+]\!]} \mathbb{R} \xrightarrow{[\![\varsigma]\!]} \mathbb{R},$$

where $[\![\cdot_n]\!]$, $[\![+]\!]$ and $[\![\varsigma]\!]$ are the usual inner product, addition and the sigmoid
function on $\mathbb{R}$, respectively.
Inside this category, we can straightforwardly study the first order language
without $\lambda$'s, and automatic differentiation. In fact, we can prove the following
by plain induction on the syntax:

The interpretation of the (syntactic) forward AD $\overrightarrow{\mathcal{D}}(t)$ of a first-order term $t$
equals the usual (semantic) derivative of the interpretation of $t$ as a smooth
function.

However, as is well known, the category CartSp does not support function
spaces. To see this, notice that we have polynomial terms

$$x_1, \dots, x_d : \mathbf{real} \vdash \lambda y. \sum_{n=1}^{d} x_n y^n : \mathbf{real} \to \mathbf{real}$$

for each $d$, and so if we could interpret $(\mathbf{real} \to \mathbf{real})$ as a Euclidean space $\mathbb{R}^p$
then, by interpreting these polynomial expressions, we would be able to find
continuous injections $\mathbb{R}^d \to \mathbb{R}^p$ for every $d$, which is topologically impossible
for any $p$, for example as a consequence of the Borsuk-Ulam theorem (see [15],
Appx. A).

This means that we cannot interpret the functions (layer) and (comp) from (1)
in CartSp, as they are higher order functions, even though they are very useful
and innocent building blocks for differential programming! Clearly, we could
define neural nets such as (1) directly as smooth functions without any higher
order subcomponents, though that would quickly become cumbersome for deep
networks. A problematic consequence of the lack of a semantics for higher order
differential programs is that we have no obvious way of establishing
compositional semantic correctness of $\overrightarrow{\mathcal{D}}$ for the given implementation of (1).


Diffeological spaces. This motivates us to turn to a more general notion 
of differential geometry for our semantics, based on diffeological spaces [16]. 
The key idea will be that a higher order function is called smooth if it sends 
smooth functions to smooth functions, meaning that we can never use it to 
build first order functions that are not smooth. For example, (comp) in (1) has 
this property. 


Definition 1. A diffeological space $(X, \mathcal{P}_X)$ consists of a set X together with,
for each n and each open subset U of $\mathbb{R}^n$, a set $\mathcal{P}_X^U \subseteq [U \to X]$ of functions,
called plots, such that

– all constant functions are plots;
– if $f : V \to U$ is a smooth function and $p \in \mathcal{P}_X^U$, then $f;p \in \mathcal{P}_X^V$;
– if $(p_i \in \mathcal{P}_X^{U_i})_{i \in I}$ is a compatible family of plots ($x \in U_i \cap U_j \Rightarrow p_i(x) = p_j(x)$)
  and $(U_i)_{i \in I}$ covers U, then the gluing $p : U \to X : x \in U_i \mapsto p_i(x)$ is a plot.


We call a function $f : X \to Y$ between diffeological spaces smooth if, for all plots
$p \in \mathcal{P}_X^U$, we have that $p;f \in \mathcal{P}_Y^U$. We write $\mathbf{Diff}(X,Y)$ for the set of smooth
maps from X to Y. Smooth functions compose, and so we have a category Diff
of diffeological spaces and smooth functions.

A diffeological space is thus a set equipped with structure. Many construc- 
tions of sets carry over straightforwardly to diffeological spaces. 


Example 3 (Cartesian diffeologies). Each open subset U of $\mathbb{R}^n$ can be given the
structure of a diffeological space by taking all the smooth functions $V \to U$
as $\mathcal{P}_U^V$. It is easily seen that smooth functions from $V \to U$ in the traditional
sense coincide with smooth functions in the sense of diffeological spaces. Thus
diffeological spaces have a profound relationship with ordinary calculus.


In categorical terms, this gives a full embedding of CartSp in Diff. 


Example 4 (Product diffeologies). Given a family $(X_i)_{i \in I}$ of diffeological spaces,
we can equip the product $\prod_{i \in I} X_i$ of sets with the product diffeology in which
U-plots are precisely the functions of the form $(p_i)_{i \in I}$ for $p_i \in \mathcal{P}_{X_i}^U$.


This gives us the categorical product in Diff. 


Example 5 (Functional diffeology). We can equip the set $\mathbf{Diff}(X,Y)$ of smooth
functions between diffeological spaces with the functional diffeology in which
U-plots consist of functions $f : U \to \mathbf{Diff}(X,Y)$ such that $(u, x) \mapsto f(u)(x)$ is an
element of $\mathbf{Diff}(U \times X, Y)$.


This specifies the categorical function object in Diff. 


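Semantics and correctness of AD. We can now give a denotational semantics
to our language from § 2. We interpret each type $\tau$ as a set $\llbracket\tau\rrbracket$ equipped
with the relevant diffeology, by induction on the structure of types:

$$\llbracket\mathbf{real}\rrbracket \stackrel{\mathrm{def}}{=} \mathbb{R} \qquad \llbracket(\tau_1 * \ldots * \tau_n)\rrbracket \stackrel{\mathrm{def}}{=} \prod_{i=1}^{n} \llbracket\tau_i\rrbracket \qquad \llbracket\tau \to \sigma\rrbracket \stackrel{\mathrm{def}}{=} \mathbf{Diff}(\llbracket\tau\rrbracket, \llbracket\sigma\rrbracket)$$

A context $\Gamma = (x_1 : \tau_1 \ldots x_n : \tau_n)$ is interpreted as a diffeological space
$\llbracket\Gamma\rrbracket \stackrel{\mathrm{def}}{=} \prod_{i=1}^{n} \llbracket\tau_i\rrbracket$. Now well typed terms $\Gamma \vdash t : \tau$ are interpreted as smooth functions
$\llbracket t\rrbracket : \llbracket\Gamma\rrbracket \to \llbracket\tau\rrbracket$, giving a meaning for t for every valuation of the context. This is
routinely defined by induction on the structure of typing derivations. Constants
c : real are interpreted as constant functions; and the first order operations
$(+, *, \varsigma)$ are interpreted by composing with the corresponding functions, which
are smooth. For example, $\llbracket\varsigma(t)\rrbracket(\rho) \stackrel{\mathrm{def}}{=} \varsigma(\llbracket t\rrbracket(\rho))$, where $\rho \in \llbracket\Gamma\rrbracket$. Variables are
interpreted as $\llbracket x_i\rrbracket(\rho) \stackrel{\mathrm{def}}{=} \rho_i$. The remaining constructs are interpreted as follows,
and it is straightforward to show that smoothness is preserved.

$$\llbracket\langle t_1, \ldots, t_n\rangle\rrbracket(\rho) \stackrel{\mathrm{def}}{=} (\llbracket t_1\rrbracket(\rho), \ldots, \llbracket t_n\rrbracket(\rho)) \qquad \llbracket\lambda x{:}\tau.\,t\rrbracket(\rho)(a) \stackrel{\mathrm{def}}{=} \llbracket t\rrbracket(\rho, a) \quad (a \in \llbracket\tau\rrbracket)$$
$$\llbracket\mathbf{case}\ t\ \mathbf{of}\ \langle\ldots\rangle \to s\rrbracket(\rho) \stackrel{\mathrm{def}}{=} \llbracket s\rrbracket(\rho, \llbracket t\rrbracket(\rho)) \qquad \llbracket t\ s\rrbracket(\rho) \stackrel{\mathrm{def}}{=} \llbracket t\rrbracket(\rho)(\llbracket s\rrbracket(\rho))$$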

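Notice that a term $x_1 : \mathbf{real}, \ldots, x_n : \mathbf{real} \vdash t : \mathbf{real}$ is interpreted as a smooth
function $\llbracket t\rrbracket : \mathbb{R}^n \to \mathbb{R}$, even if t involves higher order functions (like (1)). Moreover
the macro differentiation $\overrightarrow{\mathcal{D}}(t)$ is a function $\llbracket\overrightarrow{\mathcal{D}}(t)\rrbracket : (\mathbb{R} \times \mathbb{R})^n \to (\mathbb{R} \times \mathbb{R})$.
This enables us to state a limited version of our main correctness theorem:


Theorem 1 (Semantic correctness of $\overrightarrow{\mathcal{D}}$ (limited)). For any term
$x_1 : \mathbf{real}, \ldots, x_n : \mathbf{real} \vdash t : \mathbf{real}$, the function $\llbracket\overrightarrow{\mathcal{D}}(t)\rrbracket$ is the dual numbers
representation (2) of $\llbracket t\rrbracket$. In detail: for any smooth functions $f_1 \ldots f_n : \mathbb{R} \to \mathbb{R}$,

$$(f_1, \nabla f_1, \ldots, f_n, \nabla f_n); \llbracket\overrightarrow{\mathcal{D}}(t)\rrbracket = \big((f_1 \ldots f_n); \llbracket t\rrbracket,\ \nabla((f_1 \ldots f_n); \llbracket t\rrbracket)\big).$$

(For instance, if n = 2, then $\llbracket\overrightarrow{\mathcal{D}}(t)\rrbracket(x_1, 1, x_2, 0) = \big(\llbracket t\rrbracket(x_1, x_2),\ \frac{\partial \llbracket t\rrbracket(x, x_2)}{\partial x}(x_1)\big)$.)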
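Proof. We prove this by logical relations. Although the following proof is
elementary, we found it by using the categorical methods in § 5.
For each type $\tau$, we define a binary relation $S_\tau$ between curves in $\llbracket\tau\rrbracket$ and
curves in $\llbracket\overrightarrow{\mathcal{D}}(\tau)\rrbracket$, i.e. $S_\tau \subseteq \mathcal{P}^{\mathbb{R}}_{\llbracket\tau\rrbracket} \times \mathcal{P}^{\mathbb{R}}_{\llbracket\overrightarrow{\mathcal{D}}(\tau)\rrbracket}$, by induction on $\tau$:

– $S_{\mathbf{real}} \stackrel{\mathrm{def}}{=} \{(f, (f, \nabla f)) \mid f : \mathbb{R} \to \mathbb{R} \text{ smooth}\}$;
– $S_{(\tau * \sigma)} \stackrel{\mathrm{def}}{=} \{((f_1, g_1), (f_2, g_2)) \mid (f_1, f_2) \in S_\tau,\ (g_1, g_2) \in S_\sigma\}$;
– $S_{\tau \to \sigma} \stackrel{\mathrm{def}}{=} \{(f_1, f_2) \mid \forall (g_1, g_2) \in S_\tau.\ (x \mapsto f_1(x)(g_1(x)),\ x \mapsto f_2(x)(g_2(x))) \in S_\sigma\}$.

Then, we establish the following ‘fundamental lemma’:

If $x_1 : \tau_1, \ldots, x_n : \tau_n \vdash t : \sigma$ and, for all $1 \le i \le n$, $y_1 \ldots y_m : \mathbf{real} \vdash s_i : \tau_i$
is such that $\big((f_1, \ldots, f_m); \llbracket s_i\rrbracket,\ (f_1, \nabla f_1, \ldots, f_m, \nabla f_m); \llbracket\overrightarrow{\mathcal{D}}(s_i)\rrbracket\big) \in S_{\tau_i}$ for
all smooth $f_i : \mathbb{R} \to \mathbb{R}$, then

$$\big((f_1, \ldots, f_m); \llbracket t[s_1/x_1, \ldots, s_n/x_n]\rrbracket,\ (f_1, \nabla f_1, \ldots, f_m, \nabla f_m); \llbracket\overrightarrow{\mathcal{D}}(t[s_1/x_1, \ldots, s_n/x_n])\rrbracket\big)$$

is in $S_\sigma$ for all smooth $f_i : \mathbb{R} \to \mathbb{R}$.

This is proved routinely by induction on the typing derivation of t. The case
for * relies on the precise definition of $\overrightarrow{\mathcal{D}}(t * s)$, and similarly for $+$, $\varsigma$.

We conclude the theorem from the fundamental lemma by considering the
case where $\tau_i = \sigma = \mathbf{real}$, m = n and $s_i = y_i$.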
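
As an informal illustration of Theorem 1 (not part of the formal development), the following Python sketch models the interpretation of the forward AD macro on real as pairs of a value and a tangent. The class name Dual, the helper lift, and the example term t are assumptions made for this sketch only. The term is built with a higher order composition function, yet its dual-number evaluation at the seed (x1, 1, x2, 0) yields the partial derivative in x1, as the n = 2 instance of the theorem predicts.

```python
import math


class Dual:
    """A pair (value, tangent), modelling the semantics of D(real) as R x R."""

    def __init__(self, val, tan=0.0):
        self.val, self.tan = val, tan

    def __add__(self, other):
        other = lift(other)
        return Dual(self.val + other.val, self.tan + other.tan)

    def __mul__(self, other):
        other = lift(other)
        # Product rule: (x, x') * (y, y') = (x*y, x*y' + x'*y).
        return Dual(self.val * other.val,
                    self.val * other.tan + self.tan * other.val)


def lift(c):
    """Constants carry tangent 0, mirroring the pair (c, 0)."""
    return c if isinstance(c, Dual) else Dual(c, 0.0)


def sigmoid(x):
    """Sigmoid on dual numbers; its derivative is y * (1 - y)."""
    y = 1.0 / (1.0 + math.exp(-x.val))
    return Dual(y, x.tan * y * (1.0 - y))


def comp(f, g):
    """Higher order composition; it never inspects the numbers itself."""
    return lambda x: g(f(x))


def t(x1, x2):
    """Hypothetical example term: sigmoid(x1 * x2) + x1, built via comp."""
    return comp(lambda x: x * x2, sigmoid)(x1) + x1


# Seed (x1, 1, x2, 0): differentiate with respect to x1 at (0.5, 2.0).
out = t(Dual(0.5, 1.0), Dual(2.0, 0.0))
```

Here out.val is the value of the term and out.tan is its partial derivative with respect to x1, which can be compared against the closed form x2 * s * (1 - s) + 1 with s = sigmoid(x1 * x2).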


4 Extending the language: variant and inductive types 


In this section, we show that the definition of forward AD and the semantics 
generalize if we extend the language of §2 with variants and inductive types. As 
an example of inductive types, we consider lists. This specific choice is only for 
expository purposes and the whole development works at the level of generality 
of arbitrary algebraic data types generated as initial algebras of (polynomial) 
type constructors formed by finite products and variants. 

Similarly, our choice of operations is for expository purposes. More generally, 
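assume given a family of operations $(\mathrm{Op}_n)_{n \in \mathbb{N}}$ indexed by their arity n. Further
assume that each $\mathrm{op} \in \mathrm{Op}_n$ has type $\mathbf{real}^n \to \mathbf{real}$. We then ask for a certain
closure of these operations under differentiation, that is we define

$$\overrightarrow{\mathcal{D}}(\mathrm{op}(t_1, \ldots, t_n)) \stackrel{\mathrm{def}}{=} \mathbf{case}\ \overrightarrow{\mathcal{D}}(t_1)\ \mathbf{of}\ \langle x_1, x_1'\rangle \to \ldots \to \mathbf{case}\ \overrightarrow{\mathcal{D}}(t_n)\ \mathbf{of}\ \langle x_n, x_n'\rangle \to$$
$$\qquad \big\langle \mathrm{op}(x_1, \ldots, x_n),\ \textstyle\sum_{i=1}^{n} x_i' * \partial_i \mathrm{op}(x_1, \ldots, x_n) \big\rangle$$

where $\partial_i \mathrm{op}(x_1, \ldots, x_n)$ is some chosen term in the language, involving free
variables from $x_1, \ldots, x_n$, which we think of as implementing the partial derivative
of op with respect to its i-th argument. For constructing the semantics, every op
must be interpreted by some smooth function, and, to establish correctness, the
semantics of $\partial_i \mathrm{op}(x_1, \ldots, x_n)$ must be the semantic i-th partial derivative of the
semantics of $\mathrm{op}(x_1, \ldots, x_n)$.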
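
The scheme above for an n-ary operation can be sketched in Python as follows. This is a minimal sketch under stated assumptions: dual numbers are modelled as plain (value, tangent) pairs, the helper name forward_op is hypothetical, and the example operation x * sin(y) with its two partial derivatives is chosen only for illustration.

```python
import math


def forward_op(op, partials):
    """Lift an n-ary operation on floats to (value, tangent) pairs.

    `partials[i]` is assumed to implement the i-th partial derivative of
    `op`. The result pairs op(x1..xn) with sum_i x_i' * d_i op(x1..xn).
    """
    def lifted(*args):
        xs = [x for x, _ in args]
        tangent = sum(dx * d(*xs) for (_, dx), d in zip(args, partials))
        return (op(*xs), tangent)
    return lifted


# Hypothetical binary operation op(x, y) = x * sin(y) and its partials.
mul_sin = forward_op(
    lambda x, y: x * math.sin(y),
    [lambda x, y: math.sin(y),        # derivative in the first argument
     lambda x, y: x * math.cos(y)],   # derivative in the second argument
)

# Evaluate at x = 2 with tangent 1, y = 0 with tangent 3.
val, tan = mul_sin((2.0, 1.0), (0.0, 3.0))
```

Correctness of the lifted operation depends entirely on the supplied partials being the true partial derivatives, which is the closure condition the text asks for.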


Language. We additionally consider the following types and terms:

$\tau, \sigma, \rho ::= \ldots$   types
  | $\mathbf{list}(\tau)$   list
  | $\{\ell_1\,\tau_1 \mid \ldots \mid \ell_n\,\tau_n\}$   variant

$t, s, r ::= \ldots$   terms
  | $\tau.\ell\ t$   variant constructor
  | $[\,]$ | $t :: s$   empty list and cons
  | $\mathbf{case}\ t\ \mathbf{of}\ \{\ell_1\,x_1 \to s_1 \mid \ldots \mid \ell_n\,x_n \to s_n\}$   pattern matching: variants
  | $\mathbf{fold}\ (x_1, x_2).t\ \mathbf{over}\ s\ \mathbf{from}\ r$   list fold

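We extend the type system according to:

$$\frac{\Gamma \vdash t : \tau_i \quad ((\ell_i\,\tau_i) \in \tau)}{\Gamma \vdash \tau.\ell_i\ t : \tau} \qquad \frac{}{\Gamma \vdash [\,] : \mathbf{list}(\tau)} \qquad \frac{\Gamma \vdash t : \tau \quad \Gamma \vdash s : \mathbf{list}(\tau)}{\Gamma \vdash t :: s : \mathbf{list}(\tau)}$$

$$\frac{\Gamma \vdash t : \{\ell_1\,\tau_1 \mid \ldots \mid \ell_n\,\tau_n\} \quad \text{for each } 1 \le i \le n:\ \Gamma, x_i : \tau_i \vdash s_i : \tau}{\Gamma \vdash \mathbf{case}\ t\ \mathbf{of}\ \{\ell_1\,x_1 \to s_1 \mid \ldots \mid \ell_n\,x_n \to s_n\} : \tau}$$

$$\frac{\Gamma \vdash s : \mathbf{list}(\tau) \quad \Gamma \vdash r : \sigma \quad \Gamma, x_1 : \tau, x_2 : \sigma \vdash t : \sigma}{\Gamma \vdash \mathbf{fold}\ (x_1, x_2).t\ \mathbf{over}\ s\ \mathbf{from}\ r : \sigma}$$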
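We can then extend $\overrightarrow{\mathcal{D}}$ to our new types and terms by

$$\overrightarrow{\mathcal{D}}(\{\ell_1\,\tau_1 \mid \ldots \mid \ell_n\,\tau_n\}) \stackrel{\mathrm{def}}{=} \{\ell_1\,\overrightarrow{\mathcal{D}}(\tau_1) \mid \ldots \mid \ell_n\,\overrightarrow{\mathcal{D}}(\tau_n)\} \qquad \overrightarrow{\mathcal{D}}(\mathbf{list}(\tau)) \stackrel{\mathrm{def}}{=} \mathbf{list}(\overrightarrow{\mathcal{D}}(\tau))$$
$$\overrightarrow{\mathcal{D}}(\tau.\ell\ t) \stackrel{\mathrm{def}}{=} \overrightarrow{\mathcal{D}}(\tau).\ell\ \overrightarrow{\mathcal{D}}(t) \qquad \overrightarrow{\mathcal{D}}([\,]) \stackrel{\mathrm{def}}{=} [\,] \qquad \overrightarrow{\mathcal{D}}(t :: s) \stackrel{\mathrm{def}}{=} \overrightarrow{\mathcal{D}}(t) :: \overrightarrow{\mathcal{D}}(s)$$
$$\overrightarrow{\mathcal{D}}(\mathbf{case}\ t\ \mathbf{of}\ \{\ell_1\,x_1 \to s_1 \mid \ldots \mid \ell_n\,x_n \to s_n\}) \stackrel{\mathrm{def}}{=} \mathbf{case}\ \overrightarrow{\mathcal{D}}(t)\ \mathbf{of}\ \{\ell_1\,x_1 \to \overrightarrow{\mathcal{D}}(s_1) \mid \ldots \mid \ell_n\,x_n \to \overrightarrow{\mathcal{D}}(s_n)\}$$
$$\overrightarrow{\mathcal{D}}(\mathbf{fold}\ (x_1, x_2).t\ \mathbf{over}\ s\ \mathbf{from}\ r) \stackrel{\mathrm{def}}{=} \mathbf{fold}\ (x_1, x_2).\overrightarrow{\mathcal{D}}(t)\ \mathbf{over}\ \overrightarrow{\mathcal{D}}(s)\ \mathbf{from}\ \overrightarrow{\mathcal{D}}(r)$$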


To demonstrate the practical use of expressive type systems for differential 
programming, we consider the following two examples. 


Example 6 (Lists of inputs for neural nets). Usually, we run a neural network on 
a large data set, the size of which might be determined at runtime. To evaluate 
a neural network on multiple inputs, in practice, one often sums the outcomes. 
This can be coded in our extended language as follows. Suppose that we have 
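a network $f : (\mathbf{real}^n * P) \to \mathbf{real}$ that operates on single input vectors. We can
construct one that operates on lists of inputs as follows:

$$g \stackrel{\mathrm{def}}{=} \lambda\langle l, w\rangle.\ \mathbf{fold}\ (x_1, x_2).\ f\langle x_1, w\rangle + x_2\ \mathbf{over}\ l\ \mathbf{from}\ 0 : (\mathbf{list}(\mathbf{real}^n) * P) \to \mathbf{real}$$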
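
The construction of Example 6 can be mimicked in Python as a rough sketch. The helper fold below is an assumed stand-in for the language's list fold (processing the list from the right, so that folding over a cons cell applies the step to the head and the fold of the tail), and the linear "network" f is a hypothetical placeholder rather than anything from the paper.

```python
def fold(step, xs, init):
    """Right fold: fold over (s1 :: s2) is step(s1, fold over s2)."""
    acc = init
    for x in reversed(xs):
        acc = step(x, acc)
    return acc


def lift_to_lists(f):
    """Turn a network on single inputs into one that sums over a list."""
    return lambda l, w: fold(lambda x1, x2: f(x1, w) + x2, l, 0.0)


def f(x, w):
    """Hypothetical single-input network: a plain dot product."""
    return sum(xi * wi for xi, wi in zip(x, w))


g = lift_to_lists(f)
total = g([(1.0, 2.0), (3.0, 4.0)], (10.0, 1.0))
```

Because g is assembled only from f, addition, and the fold, a forward AD of g is obtained by the same fold applied to the forward AD of its body, which is the content of the fold clause of the macro.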


Example 7 (Missing data). In practically every application of statistics and ma- 
chine learning, we face the problem of missing data: for some observations, only 
partial information is available. In an expressive typed programming language 
like we consider, we can model missing data conveniently using the data type 
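$\mathbf{maybe}(\tau) \stackrel{\mathrm{def}}{=} \{\mathrm{Nothing}\ () \mid \mathrm{Just}\ \tau\}$. In the context of a neural network, one might
use it as follows. First, define some helper functions

$$\mathrm{fromMaybe} \stackrel{\mathrm{def}}{=} \lambda x.\lambda m.\ \mathbf{case}\ m\ \mathbf{of}\ \{\mathrm{Nothing}\ \_ \to x \mid \mathrm{Just}\ x' \to x'\}$$

$$\mathrm{fromMaybe}^n \stackrel{\mathrm{def}}{=} \lambda\langle x_1, \ldots, x_n\rangle.\lambda\langle m_1, \ldots, m_n\rangle.\ \langle \mathrm{fromMaybe}\ x_1\ m_1, \ldots, \mathrm{fromMaybe}\ x_n\ m_n\rangle$$
$$\qquad : (\mathbf{maybe}(\mathbf{real}))^n \to \mathbf{real}^n \to \mathbf{real}^n$$

$$\mathrm{map} \stackrel{\mathrm{def}}{=} \lambda f.\lambda l.\ \mathbf{fold}\ (x_1, x_2).\ f\ x_1 :: x_2\ \mathbf{over}\ l\ \mathbf{from}\ [\,] : (\tau \to \sigma) \to \mathbf{list}(\tau) \to \mathbf{list}(\sigma)$$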
Given a neural network $f : (\mathbf{list}(\mathbf{real}^k) * P) \to \mathbf{real}$, we can build a new one
that operates on a data set for which some covariates (features) are missing,
by passing in default values to replace the missing covariates:

$$\lambda\langle l, \langle m, w\rangle\rangle.\ f\langle \mathrm{map}\ (\mathrm{fromMaybe}^k\ m)\ l,\ w\rangle$$
$$\qquad : (\mathbf{list}((\mathbf{maybe}(\mathbf{real}))^k) * (\mathbf{real}^k * P)) \to \mathbf{real}$$


Then, given a data set l with missing covariates, we can perform automatic 
differentiation on this network to optimize, simultaneously, the ordinary network 
parameters w and the default values for missing covariates m. 
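
A rough Python rendering of Example 7 is sketched below. It is an illustration under assumptions: Python's None plays the role of Nothing, a plain float plays the role of Just, and the network f (a sum of dot products over the rows) is a hypothetical placeholder. The names from_maybe, from_maybe_n and with_defaults are chosen for this sketch only.

```python
def from_maybe(default, m):
    """Return the default when the observation is missing (None)."""
    return default if m is None else m


def from_maybe_n(defaults):
    """Componentwise from_maybe on a tuple of possibly missing values."""
    return lambda ms: tuple(from_maybe(d, m) for d, m in zip(defaults, ms))


def with_defaults(f):
    """Wrap a network on complete data so that it accepts missing data.

    The default values m become extra parameters next to the weights w.
    """
    return lambda l, m, w: f([from_maybe_n(m)(row) for row in l], w)


def f(rows, w):
    """Hypothetical network on complete data: sum of dot products."""
    return sum(sum(x * wi for x, wi in zip(row, w)) for row in rows)


net = with_defaults(f)
data = [(1.0, None), (None, 4.0)]       # two rows, one covariate missing in each
result = net(data, (0.5, 2.0), (1.0, 1.0))
```

Since the defaults m enter the computation as ordinary real inputs, differentiating net with respect to both m and w is what allows the defaults to be optimized together with the weights.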


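Semantics. In § 3 we gave a denotational semantics for the simple language in
diffeological spaces. This extends to the language in this section, as follows. As
before, each type $\tau$ is interpreted as a diffeological space, which is a set equipped
with a family of plots:

– A variant type $\{\ell_1\,\tau_1 \mid \ldots \mid \ell_n\,\tau_n\}$ is inductively interpreted as the disjoint
  union $\llbracket\{\ell_1\,\tau_1 \mid \ldots \mid \ell_n\,\tau_n\}\rrbracket \stackrel{\mathrm{def}}{=} \biguplus_{i=1}^{n} \llbracket\tau_i\rrbracket$ with U-plots

$$\mathcal{P}^U_{\llbracket\{\ell_1\,\tau_1 \mid \ldots \mid \ell_n\,\tau_n\}\rrbracket} \stackrel{\mathrm{def}}{=} \Big\{ \big[ U_j \xrightarrow{f_j} \llbracket\tau_j\rrbracket \to \biguplus_{i=1}^{n} \llbracket\tau_i\rrbracket \big]_{j=1}^{n} \ \Big|\ U = \biguplus_{j=1}^{n} U_j,\ f_j \in \mathcal{P}^{U_j}_{\llbracket\tau_j\rrbracket} \Big\}.$$

– A list type $\mathbf{list}(\tau)$ is interpreted as the set of lists, $\llbracket\mathbf{list}(\tau)\rrbracket \stackrel{\mathrm{def}}{=} \biguplus_{i=0}^{\infty} \llbracket\tau\rrbracket^i$
  with U-plots

$$\mathcal{P}^U_{\llbracket\mathbf{list}(\tau)\rrbracket} \stackrel{\mathrm{def}}{=} \Big\{ \big[ U_j \xrightarrow{f_j} \llbracket\tau\rrbracket^j \to \biguplus_{i=0}^{\infty} \llbracket\tau\rrbracket^i \big]_{j=0}^{\infty} \ \Big|\ U = \biguplus_{j=0}^{\infty} U_j,\ f_j \in \mathcal{P}^{U_j}_{\llbracket\tau\rrbracket^j} \Big\}.$$

The constructors and destructors for variants and lists are interpreted as in
the usual set theoretic semantics. It is routine to show inductively that these
interpretations are smooth. Thus every term $\Gamma \vdash t : \tau$ in the extended language
is interpreted as a smooth function $\llbracket t\rrbracket : \llbracket\Gamma\rrbracket \to \llbracket\tau\rrbracket$ between diffeological spaces.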

(In this section we focused on a language with lists, but other inductive types 

are easily interpreted in the category of diffeological spaces in much the same 
way; the categorically minded reader may regard this as a consequence of Diff 
being a concrete Grothendieck quasitopos, e.g. [3].) 


5 Categorical analysis of forward AD and its correctness 


This section has three parts. First, we give a categorical account of the functo- 
riality of AD (Ex. 8). Then we introduce our gluing construction, and relate it 
to the correctness of AD (dgm. (3)). Finally, we state and prove a correctness 
theorem for all first order types by considering a category of manifolds (Th. 2). 


Syntactic categories. Our language induces a syntactic category as follows. 


Definition 2. Let Syn be the category whose objects are types, and where a
morphism $\tau \to \sigma$ is a term in context $x : \tau \vdash t : \sigma$ modulo the βη-laws (Fig. 4).
Composition is by substitution.
For simplicity, we do not impose arithmetic identities such as x + y = y + x in
Syn. As is standard, this category has the following universal property.


Lemma 2 (e.g. [27]). For every bicartesian closed category C with list objects,
and every object $F(\mathbf{real}) \in \mathcal{C}$ and morphisms $F(c) \in \mathcal{C}(1, F(\mathbf{real}))$, $F(+), F(*) \in
\mathcal{C}(F(\mathbf{real}) \times F(\mathbf{real}), F(\mathbf{real}))$, $F(\varsigma) \in \mathcal{C}(F(\mathbf{real}), F(\mathbf{real}))$ in C, there is a
unique functor $F : \mathbf{Syn} \to \mathcal{C}$ respecting the interpretation and preserving the
bicartesian closed structure as well as list objects.


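Proof (notes). The functor $F : \mathbf{Syn} \to \mathcal{C}$ is a canonical denotational semantics
for the language, interpreting types as objects of C and terms as morphisms.
For instance, $F(\tau \to \sigma) \stackrel{\mathrm{def}}{=} (F\tau \to F\sigma)$, the function space in the category C,
and $F(t\ s) \stackrel{\mathrm{def}}{=} \langle Ft, Fs\rangle; \mathrm{eval}$ is the composite of the pairing with evaluation. When C = Diff, the denotational
semantics of the language in diffeological spaces (§3,4) can be understood as the
unique structure preserving functor $\llbracket - \rrbracket : \mathbf{Syn} \to \mathbf{Diff}$ satisfying $\llbracket\mathbf{real}\rrbracket = \mathbb{R}$,
$\llbracket\varsigma\rrbracket = \varsigma$ and so on.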


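Example 8 (Canonical definition forward AD). The forward AD macro $\overrightarrow{\mathcal{D}}$ (§2,4)
arises as a canonical cartesian closed functor on Syn. Consider the unique cartesian
closed functor $F : \mathbf{Syn} \to \mathbf{Syn}$ such that $F(\mathbf{real}) = \mathbf{real} * \mathbf{real}$, $F(c) = \overrightarrow{\mathcal{D}}(c)$,
$F(\varsigma) = \overrightarrow{\mathcal{D}}(\varsigma(x))$, and

$$F(+) = z : F(\mathbf{real}) * F(\mathbf{real}) \vdash \mathbf{case}\ z\ \mathbf{of}\ \langle x, y\rangle \to \overrightarrow{\mathcal{D}}(x + y) : F(\mathbf{real})$$

$$F(*) = z : F(\mathbf{real}) * F(\mathbf{real}) \vdash \mathbf{case}\ z\ \mathbf{of}\ \langle x, y\rangle \to \overrightarrow{\mathcal{D}}(x * y) : F(\mathbf{real})$$

Then for any type $\tau$, $F(\tau) = \overrightarrow{\mathcal{D}}(\tau)$, and for any term $x : \tau \vdash t : \sigma$, $F(t) = \overrightarrow{\mathcal{D}}(t)$
as morphisms $F(\tau) \to F(\sigma)$ in the syntactic category.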


Categorical gluing and logical relations. Gluing is a method for building 
new categorical models which has been used for many purposes, including logical 
relations and realizability [24]. Our logical relations argument in the proof of 
Th. 1 can be understood in this setting. In this subsection, for the categorically 
minded, we explain this, and in doing so we quickly recover a correctness result 
for the more general language in § 4 and for arbitrary first order types. 

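$$\begin{array}{l}
\mathbf{case}\ \langle t_1, \ldots, t_n\rangle\ \mathbf{of}\ \langle x_1, \ldots, x_n\rangle \to s \;=\; s[t_1/x_1, \ldots, t_n/x_n] \\
s[t/y] \;\stackrel{\#x_1, \ldots, x_n}{=}\; \mathbf{case}\ t\ \mathbf{of}\ \langle x_1, \ldots, x_n\rangle \to s[\langle x_1, \ldots, x_n\rangle/y] \\
(\lambda x.t)\ s \;=\; t[s/x] \qquad t \;\stackrel{\#x}{=}\; \lambda x.\,t\ x \\
\mathbf{case}\ \ell_i\,t\ \mathbf{of}\ \{\ell_1\,x_1 \to s_1 \mid \ldots \mid \ell_n\,x_n \to s_n\} \;=\; s_i[t/x_i] \\
s[t/y] \;\stackrel{\#x_1, \ldots, x_n}{=}\; \mathbf{case}\ t\ \mathbf{of}\ \{\ell_1\,x_1 \to s[\ell_1\,x_1/y] \mid \ldots \mid \ell_n\,x_n \to s[\ell_n\,x_n/y]\} \\
\mathbf{fold}\ (x_1, x_2).t\ \mathbf{over}\ [\,]\ \mathbf{from}\ r \;=\; r \\
\mathbf{fold}\ (x_1, x_2).t\ \mathbf{over}\ s_1 :: s_2\ \mathbf{from}\ r \;=\; t[s_1/x_1,\ (\mathbf{fold}\ (x_1, x_2).t\ \mathbf{over}\ s_2\ \mathbf{from}\ r)/x_2] \\
u = s[[\,]/y],\ \ r[s/x_2] = s[x_1 :: y/y] \;\Rightarrow\; s[t/y] \;\stackrel{\#x_1, x_2}{=}\; \mathbf{fold}\ (x_1, x_2).r\ \mathbf{over}\ t\ \mathbf{from}\ u
\end{array}$$

We write $\stackrel{\#x_1, \ldots, x_n}{=}$ to indicate that the variables are free in the left hand side.

Fig. 4. Standard βη-laws (e.g. [27]) for products, functions, variants and lists.

We define a category $\mathbf{Gl}_U$ whose objects are triples (X, X', S) where X
and X' are diffeological spaces and $S \subseteq \mathcal{P}_X^U \times \mathcal{P}_{X'}^U$ is a relation between their
U-plots. A morphism $(X, X', S) \to (Y, Y', T)$ is a pair of smooth functions
$f : X \to Y$, $f' : X' \to Y'$, such that if $(g, g') \in S$ then $(g; f, g'; f') \in T$. The
idea is that this is a semantic domain where we can simultaneously interpret the
language and its automatic derivatives.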


Proposition 1. The category $\mathbf{Gl}_U$ is bicartesian closed, has list objects, and the
projection functor $\mathrm{proj} : \mathbf{Gl}_U \to \mathbf{Diff} \times \mathbf{Diff}$ preserves this structure.


Proof (notes). The category $\mathbf{Gl}_U$ is a full subcategory of the comma category
$\mathrm{id}_{\mathbf{Set}} \downarrow \mathbf{Diff}(U, -) \times \mathbf{Diff}(U, -)$. The result thus follows by the general theory
of categorical gluing (e.g. [17, Lemma 15]).


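We give a semantics $\llparenthesis - \rrparenthesis = (\llparenthesis - \rrparenthesis_0, \llparenthesis - \rrparenthesis_1, S_{-})$ for the language in $\mathbf{Gl}_{\mathbb{R}}$, interpreting
types $\tau$ as objects $(\llparenthesis\tau\rrparenthesis_0, \llparenthesis\tau\rrparenthesis_1, S_\tau)$, and terms as morphisms. We let $\llparenthesis\mathbf{real}\rrparenthesis_0 \stackrel{\mathrm{def}}{=} \mathbb{R}$
and $\llparenthesis\mathbf{real}\rrparenthesis_1 \stackrel{\mathrm{def}}{=} \mathbb{R}^2$, with the relation $S_{\mathbf{real}} \stackrel{\mathrm{def}}{=} \{(f, (f, \nabla f)) \mid f : \mathbb{R} \to \mathbb{R} \text{ smooth}\}$.
We interpret the constants c as pairs $\llparenthesis c\rrparenthesis_0 \stackrel{\mathrm{def}}{=} c$ and $\llparenthesis c\rrparenthesis_1 \stackrel{\mathrm{def}}{=} (c, 0)$, and we interpret
$+, *, \varsigma$ in the standard way (meaning, like $\llbracket - \rrbracket$) in $\llparenthesis - \rrparenthesis_0$, but according to the
derivatives in $\llparenthesis - \rrparenthesis_1$, for instance, $\llparenthesis * \rrparenthesis_1 : \mathbb{R}^2 \times \mathbb{R}^2 \to \mathbb{R}^2$ is

$$\llparenthesis * \rrparenthesis_1((x, x'), (y, y')) \stackrel{\mathrm{def}}{=} (xy,\ xy' + x'y).$$

At this point one checks that these interpretations are indeed morphisms in
$\mathbf{Gl}_{\mathbb{R}}$. This amounts to checking that these interpretations are dual numbers
representations in the sense of (2). The remaining constructions of the language
are interpreted using the categorical structure of $\mathbf{Gl}_{\mathbb{R}}$, following Lem. 2.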

Notice that the diagram below commutes. One can check this by hand or 
note that it follows from the initiality of Syn (Lem. 2): all the functors preserve 
all the structure. 


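$$\begin{array}{ccc}
\mathbf{Syn} & \xrightarrow{\ (\mathrm{id},\ \overrightarrow{\mathcal{D}}(-))\ } & \mathbf{Syn} \times \mathbf{Syn} \\
{\scriptstyle \llparenthesis - \rrparenthesis}\ \downarrow & & \downarrow\ {\scriptstyle \llbracket - \rrbracket \times \llbracket - \rrbracket} \\
\mathbf{Gl}_{\mathbb{R}} & \xrightarrow{\ \mathrm{proj}\ } & \mathbf{Diff} \times \mathbf{Diff}
\end{array} \qquad (3)$$

We thus arrive at a restatement of the correctness theorem (Th. 1), which holds
even for the extended language with variants and lists, because for any $x_1 \ldots x_n :
\mathbf{real} \vdash t : \mathbf{real}$, the interpretations $(\llbracket t\rrbracket, \llbracket\overrightarrow{\mathcal{D}}(t)\rrbracket)$ are in the image of the projection
$\mathbf{Gl}_{\mathbb{R}} \to \mathbf{Diff} \times \mathbf{Diff}$, and hence $\llbracket\overrightarrow{\mathcal{D}}(t)\rrbracket$ is a dual numbers encoding of $\llbracket t\rrbracket$.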


Correctness at all first order types, via manifolds. We now generalize 
Theorem 1 to hold at all first order types, not just the reals. To do this, we 
need to define the derivative of a smooth map between the interpretations of 
first order types. We do this by recalling the well known theory of manifolds and 
tangent bundles. 

For our purposes, a smooth manifold M is a second-countable Hausdorff topological
space together with a smooth atlas: an open cover $\mathcal{U}$ together with
homeomorphisms $(\phi_U : U \to \mathbb{R}^{n(U)})_{U \in \mathcal{U}}$ (called charts) such that $\phi_U^{-1}; \phi_V$ is smooth
on its domain of definition for all $U, V \in \mathcal{U}$. A function $f : M \to N$ between
manifolds is smooth if $\phi_U^{-1}; f; \psi_V$ is smooth for all charts $\phi_U$ and $\psi_V$ of M and
N, respectively. Let us write Man for this category.

Our manifolds are slightly unusual because different charts in an atlas may 
have different finite dimension n(U). Thus we consider manifolds with dimensions 
that are potentially unbounded, albeit locally finite. This does not affect the 
theory of differential geometry as far as we need it here. 

Each open subset of $\mathbb{R}^n$ can be regarded as a manifold. This lets us regard the
category of manifolds Man as a full subcategory of the category of diffeological
spaces. We consider a manifold $(X, \{\phi_U\}_U)$ as a diffeological space with the same
carrier set X and where the plots $\mathcal{P}_X^U$ are the smooth functions in $\mathbf{Man}(U, X)$.
A function $X \to Y$ is smooth in the sense of manifolds if and only if it is smooth
in the sense of diffeological spaces [16]. For the categorically minded reader, this 
means that we have a full embedding of Man into Diff. Moreover, the natural 
interpretation of the first order fragment of our language in Man coincides with 
that in Diff. That is, the embedding of Man into Diff preserves finite products 
and countable coproducts (hence initial algebras of polynomial endofunctors). 


Proposition 2. Suppose that a type $\tau$ is first order, i.e. it is just built from
reals, products, variants, and lists (or, again, arbitrary inductive types), and not
function types. Then the diffeological space $\llbracket\tau\rrbracket$ is a manifold.


Proof (notes). This is proved by induction on the structure of types. In fact, one
may show that every such $\llbracket\tau\rrbracket$ is isomorphic to a manifold of the form $\biguplus_{i=1}^{n} \mathbb{R}^{d_i}$
where the bound n is either finite or $\infty$, but this isomorphism is typically not
an identity function.


The constraint to first order types is necessary because, e.g. the space $\llbracket\mathbf{real} \to
\mathbf{real}\rrbracket$ is not a manifold, because of a Borsuk-Ulam argument (see [15], Appx. A).

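We recall that the derivative of any morphism $f : M \to N$ of manifolds is
given as follows. For each point x in a manifold M, define the tangent space
$\mathcal{T}_x M$ to be the set $\{\gamma \in \mathbf{Man}(\mathbb{R}, M) \mid \gamma(0) = x\}/\!\sim$ of equivalence classes $[\gamma]$ of
smooth curves $\gamma$ in M based at x, where we identify $\gamma_1 \sim \gamma_2$ iff $\nabla(\gamma_1; f)(0) =
\nabla(\gamma_2; f)(0)$ for all smooth $f : M \to \mathbb{R}$. The tangent bundle of M is the set
$\mathcal{T}(M) \stackrel{\mathrm{def}}{=} \biguplus_{x \in M} \mathcal{T}_x(M)$. The charts of M equip $\mathcal{T}(M)$ with a canonical manifold
structure. Then for smooth $f : M \to N$, the derivative $\mathcal{T}(f) : \mathcal{T}(M) \to \mathcal{T}(N)$
is defined as $\mathcal{T}(f)(x, [\gamma]) \stackrel{\mathrm{def}}{=} (f(x), [\gamma; f])$. All told, the derivative is a functor
$\mathcal{T} : \mathbf{Man} \to \mathbf{Man}$.
As is standard, we can understand the tangent bundle of a composite space
in terms of that of its parts.


Lemma 3. There are canonical isomorphisms $\mathcal{T}(\biguplus_{i=1}^{\infty} M_i) \cong \biguplus_{i=1}^{\infty} \mathcal{T}(M_i)$ and
$\mathcal{T}(M_1 \times \ldots \times M_n) \cong \mathcal{T}(M_1) \times \ldots \times \mathcal{T}(M_n)$.


We define a canonical isomorphism $\phi^{\overrightarrow{\mathcal{D}}\mathcal{T}}_{\tau} : \llbracket\overrightarrow{\mathcal{D}}(\tau)\rrbracket \to \mathcal{T}(\llbracket\tau\rrbracket)$ for every type $\tau$,
by induction on the structure of types. We let $\phi^{\overrightarrow{\mathcal{D}}\mathcal{T}}_{\mathbf{real}} : \llbracket\overrightarrow{\mathcal{D}}(\mathbf{real})\rrbracket \to \mathcal{T}(\llbracket\mathbf{real}\rrbracket)$ be
given by $\phi^{\overrightarrow{\mathcal{D}}\mathcal{T}}_{\mathbf{real}}(x, x') \stackrel{\mathrm{def}}{=} (x, [t \mapsto x + x't])$. For the other types, we use Lemma 3.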


We can now phrase correctness at all first order types. 


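Theorem 2 (Semantic correctness of $\overrightarrow{\mathcal{D}}$ (full)). For any ground $\tau$, any first
order context $\Gamma$ and any term $\Gamma \vdash t : \tau$, the syntactic translation $\overrightarrow{\mathcal{D}}$ coincides
with the tangent bundle functor, modulo these canonical isomorphisms:

$$\begin{array}{ccc}
\llbracket\overrightarrow{\mathcal{D}}(\Gamma)\rrbracket & \xrightarrow{\ \llbracket\overrightarrow{\mathcal{D}}(t)\rrbracket\ } & \llbracket\overrightarrow{\mathcal{D}}(\tau)\rrbracket \\
{\scriptstyle \phi^{\overrightarrow{\mathcal{D}}\mathcal{T}}_{\Gamma}}\ \downarrow\ {\scriptstyle \cong} & & {\scriptstyle \cong}\ \downarrow\ {\scriptstyle \phi^{\overrightarrow{\mathcal{D}}\mathcal{T}}_{\tau}} \\
\mathcal{T}\llbracket\Gamma\rrbracket & \xrightarrow{\ \mathcal{T}\llbracket t\rrbracket\ } & \mathcal{T}\llbracket\tau\rrbracket
\end{array}$$

Proof (notes). For any curve $\gamma \in \mathbf{Man}(\mathbb{R}, M)$, let $\bar{\gamma} \in \mathbf{Man}(\mathbb{R}, \mathcal{T}(M))$ be the
tangent curve, given by $\bar{\gamma}(x) = (\gamma(x), [t \mapsto \gamma(x + t)])$. First, we note that a
smooth map $h : \mathcal{T}(M) \to \mathcal{T}(N)$ is of the form $\mathcal{T}(g)$ for some $g : M \to N$ if
for all smooth curves $\gamma : \mathbb{R} \to M$ we have $\bar{\gamma}; h = \overline{(\gamma; g)} : \mathbb{R} \to \mathcal{T}(N)$. This
generalizes (2). Second, for any first order type $\tau$, $S_{\llbracket\tau\rrbracket} = \{(f, \tilde{f}) \mid \tilde{f}; \phi^{\overrightarrow{\mathcal{D}}\mathcal{T}}_{\tau} = \bar{f}\}$.
This is shown by induction on the structure of types. We conclude the theorem
from diagram (3), by putting these two observations together.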


6 A continuation-based AD algorithm 


We now illustrate the flexibility of our framework by briefly describing an alternative
syntactic translation $\overleftarrow{\mathcal{D}}_{\rho}$. This alternative translation uses aspects of
continuation passing style, inspired by recent developments in reverse mode AD [34,
5]. In brief, $\overleftarrow{\mathcal{D}}_{\rho}$ works by $\overleftarrow{\mathcal{D}}_{\rho}(\mathbf{real}) = (\mathbf{real} * (\mathbf{real} \to \rho))$. Thus instead of using a
pair of a number and its tangent, we use a pair of a number and a continuation.
The answer type $\rho = \mathbf{real}^k$ needs to have the structure of a vector space, and
the continuations that we consider will turn out to be linear maps. Because we
work in continuation passing style, the chain rule is applied contravariantly. If
the reader is familiar with reverse-mode AD algorithms, they may think of the
dimension k as the number of memory cells used to store the result.

Computing the whole gradient of a term $x_1 : \mathbf{real}, \ldots, x_k : \mathbf{real} \vdash t : \mathbf{real}$ at
once is then achieved by running $\overleftarrow{\mathcal{D}}_k(t)$ on a k-tuple of basis vectors for $\mathbf{real}^k$.

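We define the continuation-based AD macro $\overleftarrow{\mathcal{D}}_k$ on types and terms as the
unique structure preserving functor $\mathbf{Syn} \to \mathbf{Syn}$ with $\overleftarrow{\mathcal{D}}_k(\mathbf{real}) = (\mathbf{real} * (\mathbf{real} \to
\mathbf{real}^k))$ and

$$\overleftarrow{\mathcal{D}}_k(c) \stackrel{\mathrm{def}}{=} \langle c, \lambda z.\langle 0, \ldots, 0\rangle\rangle$$
$$\overleftarrow{\mathcal{D}}_k(t + s) \stackrel{\mathrm{def}}{=} \mathbf{case}\ \overleftarrow{\mathcal{D}}_k(t)\ \mathbf{of}\ \langle x, x'\rangle \to \mathbf{case}\ \overleftarrow{\mathcal{D}}_k(s)\ \mathbf{of}\ \langle y, y'\rangle \to \langle x + y,\ \lambda z.\ x'\ z + y'\ z\rangle$$
$$\overleftarrow{\mathcal{D}}_k(t * s) \stackrel{\mathrm{def}}{=} \mathbf{case}\ \overleftarrow{\mathcal{D}}_k(t)\ \mathbf{of}\ \langle x, x'\rangle \to \mathbf{case}\ \overleftarrow{\mathcal{D}}_k(s)\ \mathbf{of}\ \langle y, y'\rangle \to \langle x * y,\ \lambda z.\ x'\ (y * z) + y'\ (x * z)\rangle$$
$$\overleftarrow{\mathcal{D}}_k(\varsigma(t)) \stackrel{\mathrm{def}}{=} \mathbf{case}\ \overleftarrow{\mathcal{D}}_k(t)\ \mathbf{of}\ \langle x, x'\rangle \to \mathbf{let}\ y = \varsigma(x)\ \mathbf{in}\ \langle y,\ \lambda z.\ x'\ (y * (1 - y) * z)\rangle.$$

Here, we use sugar $x : \mathbf{real}^k, y : \mathbf{real}^k \vdash x + y \stackrel{\mathrm{def}}{=} \mathbf{case}\ x\ \mathbf{of}\ \langle x_1, \ldots, x_k\rangle \to
\mathbf{case}\ y\ \mathbf{of}\ \langle y_1, \ldots, y_k\rangle \to \langle x_1 + y_1, \ldots, x_k + y_k\rangle$. (We could easily expand this
definition by making $\overleftarrow{\mathcal{D}}_k$ preserve all other term and type formers, as we did for $\overrightarrow{\mathcal{D}}$.)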
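Note that the corresponding scheme for an arbitrary n-ary operation op would
be (cf. the scheme for forward AD in §4)

$$\overleftarrow{\mathcal{D}}_k(\mathrm{op}(t_1, \ldots, t_n)) \stackrel{\mathrm{def}}{=} \mathbf{case}\ \overleftarrow{\mathcal{D}}_k(t_1)\ \mathbf{of}\ \langle x_1, x_1'\rangle \to \ldots \to \mathbf{case}\ \overleftarrow{\mathcal{D}}_k(t_n)\ \mathbf{of}\ \langle x_n, x_n'\rangle \to$$
$$\qquad \big\langle \mathrm{op}(x_1, \ldots, x_n),\ \lambda z.\ \textstyle\sum_{i=1}^{n} x_i'\ (\partial_i \mathrm{op}(x_1, \ldots, x_n) * z) \big\rangle$$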

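The idea is that $\overleftarrow{\mathcal{D}}_k(t)$ is a higher order function that simultaneously computes
t (the forward pass) and defines as a continuation the reverse pass which computes
the gradient. In order to actually run the algorithm, we need two auxiliary
definitions

$$\mathrm{lamR}^k_{\mathbf{real}} \stackrel{\mathrm{def}}{=} \lambda z.\ \mathbf{case}\ z\ \mathbf{of}\ \langle x, x'\rangle \to \mathbf{case}\ x'\ \mathbf{of}\ \langle x'_1, \ldots, x'_k\rangle \to \langle x,\ \lambda y.\langle x'_1 * y, \ldots, x'_k * y\rangle\rangle : \overrightarrow{\mathcal{D}}_k(\mathbf{real}) \to \overleftarrow{\mathcal{D}}_k(\mathbf{real})$$
$$\mathrm{evR}^k_{\mathbf{real}} \stackrel{\mathrm{def}}{=} \lambda z.\ \mathbf{case}\ z\ \mathbf{of}\ \langle x, x'\rangle \to \langle x,\ x'\ 1\rangle : \overleftarrow{\mathcal{D}}_k(\mathbf{real}) \to \overrightarrow{\mathcal{D}}_k(\mathbf{real}).$$

Here, $\overrightarrow{\mathcal{D}}_k$ is a macro on types (and terms) with exactly the same inductive definition
as $\overrightarrow{\mathcal{D}}$ except for the base case $\overrightarrow{\mathcal{D}}_k(\mathbf{real}) = (\mathbf{real} * \mathbf{real}^k)$. By noting that
both $\overrightarrow{\mathcal{D}}_k$ and $\overleftarrow{\mathcal{D}}_k$ preserve all type formers, we can extend these definitions to all
first order types $\tau$: $z : \overrightarrow{\mathcal{D}}_k(\tau) \vdash \mathrm{lamR}^k_{\tau}(z) : \overleftarrow{\mathcal{D}}_k(\tau)$, $z : \overleftarrow{\mathcal{D}}_k(\tau) \vdash \mathrm{evR}^k_{\tau}(z) : \overrightarrow{\mathcal{D}}_k(\tau)$.
We can think of $\mathrm{lamR}^k_{\tau}(z)$ as encoding k tangent vectors $z : \overrightarrow{\mathcal{D}}_k(\tau)$ as a closure,
so it is suitable for running $\overleftarrow{\mathcal{D}}_k(t)$ on, and $\mathrm{evR}^k_{\tau}(z)$ as actually evaluating the
reverse pass defined by $z : \overleftarrow{\mathcal{D}}_k(\tau)$ and returning the result as k tangent vectors.
The idea is that given some $x : \tau \vdash t : \sigma$ between first order types $\tau, \sigma$, we run
our continuation-based AD by running $\mathrm{evR}^k_{\sigma}(\overleftarrow{\mathcal{D}}_k(t)[\mathrm{lamR}^k_{\tau}(z)/x])$.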
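
To make the continuation-based translation concrete, here is a small Python sketch for k = 2. It is an illustration under assumptions, not the paper's implementation: values of the continuation-based type at real are modelled as pairs (x, x') with x' a Python closure returning a k-tuple, the function names (const, add, mul, sigmoid, lam_r, ev_r, gradient) are chosen for this sketch, and the example program is hypothetical.

```python
import math

K = 2  # dimension of the answer type real^K (assumed for this sketch)


def vadd(u, v):
    """Componentwise addition on K-tuples (the sugar x + y on real^K)."""
    return tuple(a + b for a, b in zip(u, v))


def const(c):
    return (c, lambda z: (0.0,) * K)


def add(t, s):
    (x, dx), (y, dy) = t, s
    return (x + y, lambda z: vadd(dx(z), dy(z)))


def mul(t, s):
    (x, dx), (y, dy) = t, s
    return (x * y, lambda z: vadd(dx(y * z), dy(x * z)))


def sigmoid(t):
    x, dx = t
    y = 1.0 / (1.0 + math.exp(-x))
    return (y, lambda z: dx(y * (1.0 - y) * z))


def lam_r(z):
    """Encode K tangent components as a linear continuation."""
    x, tangents = z
    return (x, lambda y: tuple(ti * y for ti in tangents))


def ev_r(z):
    """Run the reverse pass by evaluating the continuation at 1."""
    x, cont = z
    return (x, cont(1.0))


def gradient(t, v, w):
    """Value and gradient of a two-argument program t at (v, w)."""
    x = lam_r((v, (1.0, 0.0)))   # first basis vector
    y = lam_r((w, (0.0, 1.0)))   # second basis vector
    return ev_r(t(x, y))


def program(x, y):
    """Hypothetical example: x * y + sigmoid(x)."""
    return add(mul(x, y), sigmoid(x))


val, grad = gradient(program, 3.0, 4.0)
```

The forward pass computes val while building a chain of closures; calling the final continuation at 1 then propagates sensitivities back to the inputs, so both partial derivatives are obtained in one pass.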
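The correctness proof closely follows that for forward AD. In particular,
one defines a binary logical relation $\llparenthesis\mathbf{real}\rrparenthesis^{r,k} \stackrel{\mathrm{def}}{=} (\mathbb{R},\ \mathbb{R} \times (\mathbb{R}^k)^{\mathbb{R}},\ S^{r,k}_{\mathbf{real}})$, where
$S^{r,k}_{\mathbf{real}} \stackrel{\mathrm{def}}{=} \{(f,\ x \mapsto (f(x),\ y \mapsto (\partial_1 f(x) * y, \ldots, \partial_k f(x) * y))) \mid f \in \mathcal{P}^{\mathbb{R}^k}_{\mathbb{R}}\}$, on the
plots $\mathcal{P}^{\mathbb{R}^k}_{\mathbb{R}} \times \mathcal{P}^{\mathbb{R}^k}_{\mathbb{R} \times (\mathbb{R}^k)^{\mathbb{R}}}$, and verifies that $\llbracket c\rrbracket \times \llbracket\overleftarrow{\mathcal{D}}_k(c)\rrbracket$, $\llbracket x + y\rrbracket \times \llbracket\overleftarrow{\mathcal{D}}_k(x + y)\rrbracket$,
$\llbracket x * y\rrbracket \times \llbracket\overleftarrow{\mathcal{D}}_k(x * y)\rrbracket$ and $\llbracket\varsigma(x)\rrbracket \times \llbracket\overleftarrow{\mathcal{D}}_k(\varsigma(x))\rrbracket$ respect this logical relation. It follows
that this relation extends to a functor $\llparenthesis - \rrparenthesis^{r,k} : \mathbf{Syn} \to \mathbf{Gl}_{\mathbb{R}^k}$ such that $\mathrm{id} \times \overleftarrow{\mathcal{D}}_k$
factors over $\llparenthesis - \rrparenthesis^{r,k}$, implying the correctness of the continuation-based AD by
the following lemma.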


Lemma 4. For all first order types $\tau$ (i.e. types not involving function types),
we have that $\llbracket\mathrm{evR}^k_{\tau}(\mathrm{lamR}^k_{\tau}(t))\rrbracket = \llbracket t\rrbracket$.


Proof (notes). This follows by an induction on the structure of $\tau$. The idea is
that $\mathrm{lamR}^k$ embeds reals into function spaces as linear maps, which is undone
by $\mathrm{evR}^k$ by evaluating the linear maps at 1.


To phrase correctness, in this setting, however, we need a few definitions.
Keeping in mind the canonical projection $\mathcal{T}(M) \to M$, we define $\mathcal{T}^k(M)$ as
the k-fold categorical pullback (fibre product) $\mathcal{T}(M) \times_M \ldots \times_M \mathcal{T}(M)$. To be
explicit, $\mathcal{T}^k_x M$ consists of k-tuples of tangent vectors at the base point x. Again,
$\mathcal{T}^k$ extends to a functor $\mathbf{Man} \to \mathbf{Man}$ by defining $\mathcal{T}^k(f)(x, (v_1, \ldots, v_k)) \stackrel{\mathrm{def}}{=}
(f(x), (\mathcal{T}_x(f)(v_1), \ldots, \mathcal{T}_x(f)(v_k)))$. As $\mathcal{T}^k$ preserves countable coproducts and
finite products (like $\mathcal{T}$), it follows that the isomorphisms $\phi^{\overrightarrow{\mathcal{D}}\mathcal{T}}_{\tau}$ generalize to
canonical isomorphisms $\phi^{\overrightarrow{\mathcal{D}}_k\mathcal{T}^k}_{\tau} : \llbracket\overrightarrow{\mathcal{D}}_k(\tau)\rrbracket \to \mathcal{T}^k(\llbracket\tau\rrbracket)$ for first order types $\tau$. This
leads to the following correctness statement for continuation-based AD.
Br 


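Theorem 3 (Semantic correctness of $\overleftarrow{\mathcal{D}}_k$). For any ground $\tau$, any first order
context $\Gamma$ and any term $\Gamma \vdash t : \tau$, the syntactic translation $t \mapsto \mathrm{evR}^k_{\tau}(\overleftarrow{\mathcal{D}}_k(t)[\mathrm{lamR}^k_{\Gamma}(-)/-])$
coincides with the tangent bundle functor, modulo these canonical isomorphisms:

$$\begin{array}{ccc}
\llbracket\overrightarrow{\mathcal{D}}_k(\Gamma)\rrbracket & \xrightarrow{\ \llbracket\mathrm{lamR}^k_{\Gamma};\ \overleftarrow{\mathcal{D}}_k(t);\ \mathrm{evR}^k_{\tau}\rrbracket\ } & \llbracket\overrightarrow{\mathcal{D}}_k(\tau)\rrbracket \\
{\scriptstyle \phi^{\overrightarrow{\mathcal{D}}_k\mathcal{T}^k}_{\Gamma}}\ \downarrow\ {\scriptstyle \cong} & & {\scriptstyle \cong}\ \downarrow\ {\scriptstyle \phi^{\overrightarrow{\mathcal{D}}_k\mathcal{T}^k}_{\tau}} \\
\mathcal{T}^k\llbracket\Gamma\rrbracket & \xrightarrow{\ \mathcal{T}^k\llbracket t\rrbracket\ } & \mathcal{T}^k\llbracket\tau\rrbracket
\end{array}$$

For example, when $\tau = \mathbf{real}$ and $\Gamma = x, y : \mathbf{real}$, we can run our continuation-based
AD to compute the gradient of a program $x, y : \mathbf{real} \vdash t : \mathbf{real}$ at values
x = V, y = W by evaluating

$$\mathrm{evR}^2_{\mathbf{real}}\big(\overleftarrow{\mathcal{D}}_2(t)[\mathrm{lamR}^2_{\mathbf{real}}(x)/x,\ \mathrm{lamR}^2_{\mathbf{real}}(y)/y]\big)[\langle V, \langle 1, 0\rangle\rangle/x,\ \langle W, \langle 0, 1\rangle\rangle/y].$$

Indeed,

$$\llbracket\mathrm{evR}^2_{\mathbf{real}}\big(\overleftarrow{\mathcal{D}}_2(t)[\mathrm{lamR}^2_{\mathbf{real}}(x)/x,\ \mathrm{lamR}^2_{\mathbf{real}}(y)/y]\big)\rrbracket\big((\llbracket V\rrbracket, (1, 0)), (\llbracket W\rrbracket, (0, 1))\big) =$$
$$\Big(\llbracket t\rrbracket(\llbracket V\rrbracket, \llbracket W\rrbracket),\ \frac{\partial \llbracket t\rrbracket(x, \llbracket W\rrbracket)}{\partial x}(\llbracket V\rrbracket),\ \frac{\partial \llbracket t\rrbracket(\llbracket V\rrbracket, y)}{\partial y}(\llbracket W\rrbracket)\Big).$$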


7 Discussion and future work 


Summary. We have shown that diffeological spaces provide a denotational
semantics for a higher order language with variants and inductive types (§3,4).
We have used this to show correctness of a simple AD translation (Thm. 1, 
Thm. 2). But the method is not tied to this specific translation, as we illustrated 
in Section 6. 

The structure of our elementary correctness argument for Theorem 1 is a 
typical logical relations proof. As explained in Section 5, this can equivalently 
be understood as a denotational semantics in a new kind of space obtained by 
categorical gluing. 

Overall, then, there are two logical relations at play. One is in diffeological 
spaces, which ensures that all definable functions are smooth. The other is in the 
correctness proof (equivalently in the categorical gluing), which explicitly tracks 
the derivative of each function, and tracks the syntactic AD even at higher types. 


Connection to the state of the art in AD implementation. As is common 
in denotational semantics research, we have here focused on an idealized language 
and simple translations to illustrate the main aspects of the method. There are 
a number of points where our approach is simplistic compared to the advanced 
current practice, as we now explain. 
Representation of vectors. In our examples we have treated n-vectors as tuples
of length n. This style of programming does not scale to large n. A better 
solution would be to use array types, following [31]. Our categorical semantics 
and correctness proofs straightforwardly extend to cover them, in a similar way 
to our treatment of lists. 


Efficient forward-mode AD. For AD to be useful, it must be fast. The syntactic 
translation $\overrightarrow{\mathcal{D}}$ that we use is the basis of an efficient AD library [31]. However,
numerous optimizations are needed, ranging from algebraic manipulations, to 
partial evaluations, to the use of an optimizing C compiler. A topic for future 
work would be to validate some of these manipulations using our semantics. The 
resulting implementation is performant in experiments [31]. 


Efficient reverse-mode AD. Our sketch of continuation-based AD is primarily 
intended to emphasise that our denotational approach is not tied to any specific
translation $\overrightarrow{\mathcal{D}}$. Nonetheless, it is worth noting that this algorithm shares similarities
with advanced reverse-mode implementations: (1) it calculates derivatives in
a (contravariant) “reverse pass” in which derivatives of operations are evaluated
in the reverse order compared to their order in calculating the function value;
(2) it can be used to calculate the full gradient of a function $\mathbb{R}^n \to \mathbb{R}$ in a single
reverse pass (while n passes of forward AD would be necessary). However, it lacks
important optimizations and the continuation scales with the size of the input n 
where it should scale with the size of the output. This adds an important over- 
head, as pointed out in [26]. Speed being the main attraction of reverse-mode 
AD, its implementations tend to rely on mutable state, control operators and/or 
staging [26, 6,34,5], which we have not considered here. 


Other language features. The idealized languages that we considered so far do 
not touch on several useful language constructs. For example: the use of functions 
that are partial (such as division) or partly-smooth (such as ReLU); phenomena
such as iteration, recursion; and probabilities. There are suggestions that the 
denotational approach using diffeological spaces can be adapted to these features 
using standard categorical methods. We leave this for future work. 
Abstract. This paper introduces deep induction, and shows that it is
the notion of induction most appropriate to nested types and other data 
types defined over, or mutually recursively with, (other) such types. Stan- 
dard induction rules induct over only the top-level structure of data, 
leaving any data internal to the top-level structure untouched. By con- 
trast, deep induction rules induct over all of the structured data present. 
We give a grammar generating a robust class of nested types (and thus 
ADTs), and develop a fundamental theory of deep induction for them 
using their recently defined semantics as fixed points of accessible func- 
tors on locally presentable categories. We then use our theory to derive 
deep induction rules for some common ADTs and nested types, and 
show how these rules specialize to give the standard structural induction 
rules for these types. We also show how deep induction specializes to 
solve the long-standing problem of deriving principled and practically 
useful structural induction rules for bushes and other truly nested types. 
Overall, deep induction opens the way to making induction principles 
appropriate to richly structured data types available in programming 
languages and proof assistants. Agda implementations of our develop- 
ment and examples, including two extended case studies, are available. 


1 Introduction 


This paper is concerned with the problem of inductive reasoning about induc- 
tive data types that are defined over, or are defined mutually recursively with, 
(other) such data types. Examples of such deep data types include, trivially, ordi- 
nary algebraic data types (ADTs), such as list and tree types; data types, such 
as the forest type, whose recursive occurrences appear below other type con- 
structors; simple nested types, such as the type of perfect trees, whose recursive 
occurrences never appear below their own type constructors; truly¹ nested types,
such as the type of bushes (also called bootstrapped heaps by Okasaki [16]), whose 
recursive occurrences do appear below their own type constructors; and GADTs. 
Proof assistants, including Coq and Agda, currently provide insufficient support 
for performing induction over deep data types. The induction rules, if any, they 
generate for such types induct over only their top-level structures, leaving any 
data internal to the top-level structure untouched. This paper develops a prin- 
ciple that, by contrast, inducts over all of the structured data present. We call 
this principle deep induction. Deep induction not only provides general support 
for solving problems that previously had only (usually quite painful and) ad 
hoc solutions, but also opens the way for incorporating automatic generation of 
useful induction rules for deep data types into proof assistants. 


1 Nested types that are defined over themselves are known as truly nested types. 


© The Author(s) 2020 
To illustrate the difference between structural induction and deep induction, 
note that the data inside a structure of type List a = Nil | Cons a (List a) is 
treated monolithically (i.e., ignored) by the structural induction rule for lists: 


∀ (a : Set) (P : List a → Set) → P Nil → 
(∀ (x : a) (xs : List a) → P xs → P (Cons x xs)) → ∀ (xs : List a) → P xs 


By contrast, the deep induction rule for lists traverses not just the outer list 
structure with a predicate P, but also each data element of that list with a 
custom predicate Q: 


∀ (a : Set) (P : List a → Set) (Q : a → Set) → 
P Nil → (∀ (x : a) (xs : List a) → Q x → P xs → P (Cons x xs)) → 
∀ (xs : List a) → List^ Q xs → P xs 


Here, List^ lifts its argument predicate Q on data of type a to a predicate on 
data of type List a asserting that Q holds for every element of its argument list. 
The structural induction rule for lists is, like that for any ADT, recovered by 
taking the custom predicate in the corresponding deep rule to be λx. True. 

A particular advantage of deep induction is that it obviates the need to reflect 
properties as data types. For example, although the set of primes cannot be de- 
fined by an ADT, the primeness predicate Prime on the ADT of natural numbers 
can be lifted to a predicate List^ Prime characterizing lists of primes. Properties 
can then be proved for lists of primes using the following deep induction rule: 


∀ (P : List Nat → Set) → P Nil → 
(∀ (x : Nat) (xs : List Nat) → Prime x → P xs → P (Cons x xs)) → 
∀ (xs : List Nat) → List^ Prime xs → P xs 
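
As a rough executable illustration (a sketch only, not part of the Agda development), the lifting List^ can be shadowed in Haskell by letting Bool stand in for Set, on the assumption that the predicates involved are decidable. The names listLift, isPrime, primes, and mixed are hypothetical and chosen for this example.

```haskell
-- The list ADT from the text.
data List a = Nil | Cons a (List a)

-- Bool-valued shadow of List^: holds iff q holds for every element.
-- Assumption: predicates are decidable, so Bool can stand in for Set.
listLift :: (a -> Bool) -> List a -> Bool
listLift _ Nil         = True
listLift q (Cons x xs) = q x && listLift q xs

-- Hypothetical trial-division test playing the role of Prime.
isPrime :: Int -> Bool
isPrime n = n > 1 && all (\d -> n `mod` d /= 0) [2 .. n - 1]

-- Example data: a list of primes and a list containing a non-prime.
primes, mixed :: List Int
primes = Cons 2 (Cons 3 (Cons 5 Nil))
mixed  = Cons 2 (Cons 4 Nil)
```

Here listLift isPrime accepts primes and rejects mixed, while listLift (const True) accepts every list, mirroring how the structural rule is recovered by taking the custom predicate to be λx. True.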


As we’ll see in Sections 3, 4, and 5, the extra flexibility afforded by lifting predi- 
cates like Q and Prime on data internal to a structure makes it possible to derive 
useful induction principles for more complex types, such as truly nested ones. 

In each of the above examples, a predicate on the data is lifted to a predicate 
on the list. This is an example of lifting a predicate on a type in a non-recursive 
position of an ADT’s definition to the entire ADT. However, the predicate to 
be lifted can also be on the type in a recursive position of a definition — i.e., on 
the ADT being defined itself — and this ADT can appear below another type 
constructor in the definition. This is exactly the situation for the ADT Forest a, 
which appears below the type constructor List in the definition 


Forest a = FEmpty | FNode a (List (Forest a)) 


The induction rule Coq generates for forests is 


∀ (a : Set) (P : Forest a → Set) → P FEmpty → 
(∀ (x : a) (ts : List (Forest a)) → P (FNode x ts)) → ∀ (x : Forest a) → P x 


However, this is neither the induction rule we intuitively expect, nor is it expres- 
sive enough to prove even basic properties of forests that ought to be amenable 
to inductive proof. The approach of [11,12] does give the expected rule² 


2 This is equivalent to the rule as classically stated in Coq/Isabelle/HOL. 


∀ (a : Set) (P : Forest a → Set) → P FEmpty → 
(∀ (x : a) (ts : List (Forest a)) → (∀ (k < length ts) → P (ts !! k)) 
→ P (FNode x ts)) → ∀ (x : Forest a) → P x 


But to derive it, a technique based on list positions is used to propagate the 
predicate to be proved over the list of forests that is the second argument to the 
data constructor FNode. Unfortunately, this technique does not obviously extend 
to other deep data types, including the type of “generalized forests” introduced 
in Section 4.4 below, which combines smaller generalized forests into larger ones 
using a type constructor f potentially different from List. Nevertheless, replacing 
∀ (k < length ts) → P (ts !! k) in the expected rule with List^ P ts, which is 
equivalent, reveals that it is nothing more than the special case for Q = λx. True 
of the following deep induction rule for Forest a: 


∀ (a : Set) (P : Forest a → Set) (Q : a → Set) → P FEmpty → 
(∀ (x : a) (ts : List (Forest a)) → Q x → List^ P ts → P (FNode x ts)) → 
∀ (x : Forest a) → Forest^ Q x → P x 
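
The premise Forest^ Q x of this rule combines two liftings: Q is checked at each node, and the forest lifting itself is pushed through the list of subforests. A minimal Haskell sketch, again with Bool standing in for Set and with built-in lists and `all` standing in for List and List^ (the names forestLift and exampleForest are hypothetical):

```haskell
-- Forests, using Haskell's built-in lists for the inner List constructor.
data Forest a = FEmpty | FNode a [Forest a]

-- Bool-valued shadow of Forest^: q must hold at every node, and the
-- lifted predicate must hold for every subforest (List^ (Forest^ q)).
forestLift :: (a -> Bool) -> Forest a -> Bool
forestLift _ FEmpty       = True
forestLift q (FNode x ts) = q x && all (forestLift q) ts

-- Example forest whose node labels are all even.
exampleForest :: Forest Int
exampleForest = FNode 2 [FNode 4 [], FNode 6 [FEmpty]]
```

For instance, forestLift even exampleForest holds, whereas forestLift (> 2) exampleForest fails at the root.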


When types, like Forest a and List (Forest a) above, are defined by mutual 
recursion, their (deep) induction rules are defined by mutual recursion as well. 
For example, the induction rules for the ADTs 


data Expr = Lit Nat | Add Expr Expr | If BExpr Expr Expr 
data BExpr = BLit Bool | And BExpr BExpr | Not BExpr | Equal Expr Expr 


of integer and boolean expressions are defined by mutual recursion as 


∀ (P : Expr → Set) (Q : BExpr → Set) → 
(∀ (n : Nat) → P (Lit n)) → 
(∀ (e1 : Expr) (e2 : Expr) → P e1 → P e2 → P (Add e1 e2)) → 
(∀ (b : BExpr) (e1 : Expr) (e2 : Expr) → Q b → P e1 → P e2 → P (If b e1 e2)) → 
(∀ (b : Bool) → Q (BLit b)) → 
(∀ (b1 : BExpr) (b2 : BExpr) → Q b1 → Q b2 → Q (And b1 b2)) → 
(∀ (b : BExpr) → Q b → Q (Not b)) → 
(∀ (e1 : Expr) (e2 : Expr) → P e1 → P e2 → Q (Equal e1 e2)) → 
(∀ (e : Expr) → P e) × (∀ (b : BExpr) → Q b) 


2 The Key Idea 


As the examples of the previous section suggest, the key to deriving deep induc- 
tion rules from (deep) data type declarations is to parameterize the induction 
rules not just over a predicate over the top-level data type being defined, but over 
predicates on the types of primitive data they contain as well. These additional 
predicates are then lifted to predicates on any internal structures containing 
these data, and the resulting predicates on these internal structures are lifted to 
predicates on any internal structures containing structures at the previous level, 
and so on, until the internal structures at all levels of the data type definition, 
including the top level, have been so processed. Satisfaction of a predicate by 
the data at one level of a structure is then conditioned upon satisfaction of the 
appropriate predicates by all of the data at the preceding level. 

The above deep induction rules were all obtained using this technique. For 
example, the deep induction rule for lists is derived by first noting that struc- 
tures of type List a contain only data of type a, so that only one additional 
predicate parameter, which we called Q above, is needed. Then, since the only 
data structure internal to the type List a is List a itself, Q need only be lifted 
to lists containing data of type a. This is exactly what List^ Q does. Finally, 
the deep induction rule for lists is obtained by parameterizing the standard one 
over not just P but also Q, adding the additional hypothesis Q x to its second 
antecedent, and adding the additional hypothesis List^ Q xs to its conclusion. 

The deep induction rule for forests is similarly obtained from the Coq- 
generated rule by first parameterizing over an additional predicate Q on the 
type a of data stored in the forest, then lifting P to a predicate on lists contain- 
ing data of type Forest a and Q to forests containing data of type a, and, finally, 
adding the additional hypotheses Q x and List^ P ts to its second antecedent 
and the additional hypothesis Forest^ Q x to its conclusion. 

Predicate liftings such as List^ and Forest^ may either be supplied as primitives, 
or be generated automatically from the definitions of the types themselves, 
as described in Section 4. For container types, lifting a predicate amounts to 
traversing the container and applying the argument predicate pointwise. 

Our technique for deriving deep induction rules for ADTs, as well as its gen- 
eralization to nested types given in Section 3, is both made precise and rigorously 
justified in Section 4 using the results of [13]. This paper can thus be seen as a 
concrete application, in the specific category Fam, of the very general semantics 
developed in [13]; indeed, our induction rules are computed as the interpreta- 
tions of the syntax for nested types in Fam. A general methodology is extracted 
in Section 5. The rest of this paper can be read either as “just” describing how to 
generate deep induction rules in practice, or as also proving that our technique 
for doing so is correct and general. Our Agda code is at [14]. 


3 Extending to Nested Types 


Appropriately generalizing the basic technique of Section 2 derives deep induc- 
tion rules, and therefore structural induction rules, for nested types, including 
truly nested types and other deep nested types. Nested types generalize ADTs 
by allowing elements at one instance of a data type to depend on data at other 
instances of the same type so that, in effect, the entire family of instances is 
constructed simultaneously. That is, rather than defining standalone families of 
inductive types, one for each choice of types to which type constructors like List 
and Tree are applied, the type constructors for nested types define inductive 
families of types. The structural induction rule for a nested type must therefore 
account for its changing type parameters by parameterizing over an appropri- 
ately polymorphic predicate, and appropriately instantiating that predicate’s 
type argument at each application site. For example, the structural induction 
rule for the nested type 


PTree a = PLeaf a | PNode (PTree (a × a)) 


of perfect trees is 


∀ (P : ∀ (a : Set) → PTree a → Set) → 
(∀ (a : Set) (x : a) → P a (PLeaf x)) → 
(∀ (a : Set) (x : PTree (a × a)) → P (a × a) x → P a (PNode x)) → 
∀ (a : Set) (x : PTree a) → P a x 


and the structural induction rule for the nested type 


data Lam a where 
  Var :: a → Lam a 
  App :: Lam a → Lam a → Lam a 
  Abs :: Lam (Maybe a) → Lam a 


of de Bruijn encoded lambda terms [9] with variables of type a is 


∀ (P : ∀ (a : Set) → Lam a → Set) → 
(∀ (a : Set) (x : a) → P a (Var x)) → 
(∀ (a : Set) (x : Lam a) (y : Lam a) → P a x → P a y → P a (App x y)) → 
(∀ (a : Set) (x : Lam (Maybe a)) → P (Maybe a) x → P a (Abs x)) → 
∀ (a : Set) (x : Lam a) → P a x 


Deep induction rules for nested types must similarly account for their type 
constructors’ changing type parameters while also parameterizing over the additional 
predicate on the type of data they contain. Letting Pair^ Q be the lifting 
of a predicate Q on a to pairs of type a × a, so that Pair^ Q (x, y) = Q x × Q y, 
this gives the deep induction rule 


∀ (P : ∀ (a : Set) → (a → Set) → PTree a → Set) → 
(∀ (a : Set) (Q : a → Set) (x : a) → Q x → P a Q (PLeaf x)) → 
(∀ (a : Set) (Q : a → Set) (x : PTree (a × a)) → P (a × a) (Pair^ Q) x → 
P a Q (PNode x)) → 
∀ (a : Set) (Q : a → Set) (x : PTree a) → PTree^ Q x → P a Q x 


for perfect trees, and the deep induction rule 


∀ (P : ∀ (a : Set) → (a → Set) → Lam a → Set) → 
(∀ (a : Set) (Q : a → Set) (x : a) → Q x → P a Q (Var x)) → 
(∀ (a : Set) (Q : a → Set) (x : Lam a) (y : Lam a) → P a Q x → P a Q y → 
P a Q (App x y)) → 
(∀ (a : Set) (Q : a → Set) (x : Lam (Maybe a)) → P (Maybe a) (Maybe^ Q) x → 
P a Q (Abs x)) → 
∀ (a : Set) (Q : a → Set) (x : Lam a) → Lam^ Q x → P a Q x 


for lambda terms. As usual, the structural induction rules for these types can be 
recovered by setting Q = λx. True in their deep induction rules. Moreover, the 
basic technique described in Section 2 can be recovered from the more general 
one described in this section by noting that the type arguments to ADT data 
type constructors don’t change, and that the internal predicate parameter to P 
can therefore be lifted to the outermost level of ADT induction rules. 
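
The way the predicate argument changes along with the type argument can be seen in a small Haskell sketch of the liftings PTree^ and Lam^. This is an illustration under the assumption that Bool may stand in for Set; the liftings are written by hand to match the premises of the rules above, and all function names are hypothetical. Both ptreeLift and lamLift use polymorphic recursion, which is why their type signatures are required.

```haskell
data PTree a = PLeaf a | PNode (PTree (a, a))

data Lam a = Var a | App (Lam a) (Lam a) | Abs (Lam (Maybe a))

-- Shadow of Pair^: q must hold for both components.
pairLift :: (a -> Bool) -> (a, a) -> Bool
pairLift q (x, y) = q x && q y

-- Shadow of Maybe^: the extra element Nothing satisfies the lifting trivially.
maybeLift :: (a -> Bool) -> Maybe a -> Bool
maybeLift _ Nothing  = True
maybeLift q (Just x) = q x

-- Shadow of PTree^: the recursive call is at type PTree (a, a),
-- so the predicate is replaced by its lifting to pairs.
ptreeLift :: (a -> Bool) -> PTree a -> Bool
ptreeLift q (PLeaf x) = q x
ptreeLift q (PNode t) = ptreeLift (pairLift q) t

-- Shadow of Lam^: under Abs the predicate is lifted through Maybe.
lamLift :: (a -> Bool) -> Lam a -> Bool
lamLift q (Var x)   = q x
lamLift q (App s t) = lamLift q s && lamLift q t
lamLift q (Abs b)   = lamLift (maybeLift q) b

-- Example data: a perfect tree of depth 2 and a term with one bound variable.
tree :: PTree Int
tree = PNode (PNode (PLeaf ((1, 3), (5, 7))))

term :: Lam Char
term = Abs (App (Var Nothing) (Var (Just 'x')))
```

With these definitions, ptreeLift odd tree and lamLift (== 'x') term both hold, while lamLift (== 'y') term fails on the free variable 'x'.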

We conclude this section by giving both structural and deep induction rules 
for the following truly nested type of bushes [8]: 


Bush a = BNil | BCons a (Bush (Bush a)) 


(Note that this type is not even definable in Agda.) Correct and useful structural 
induction rules for bushes and other truly nested types have long been elusive. 
One recent effort to derive such rules has been recorded in [10], but the approach 
taken there is more ad hoc than not, and generates induction rules for data types 
related to the nested types of interest rather than for the original nested types 
themselves. To treat bushes, for example, Fu and Selinger rewrite the type Bush a 
as NBush (Succ Zero) a, where NBush = NTimes Bush and 


NTimes :: (Set → Set) → Nat → Set → Set 
NTimes p Zero s = s 
NTimes p (Succ n) s = p (NTimes p n s) 


Their induction rule for bushes is then given in terms of these rewritten ones as 


∀ (a : Set) (P : ∀ (n : Nat) → NBush n a → Set) → 
(∀ (x : a) → P Zero x) → 
(∀ (n : Nat) → P (Succ n) BNil) → 
(∀ (n : Nat) (x : NBush n a) (xs : NBush (Succ (Succ n)) a) → 
P n x → P (Succ (Succ n)) xs → P (Succ n) (BCons x xs)) → 
∀ (n : Nat) (xs : NBush n a) → P n xs 


This approach appears promising, but is not yet fully mature. The core diffi- 
culty is that, although Fu and Selinger “hint at how the construction ... can 
be generalized to arbitrary nested types” and “give an example of nested data 
type [sic] that is hopefully general enough to make it clear what one would do 
in the general case” in Section 5 of [10], they do not show how to derive their 
induction rules in a uniform and principled way even for the “reasonably arbi- 
trary and general” nested types they consider. As a result, it is unclear what 
guarantees that the induction rules they derive are correct, either for the original 
nested types or for their rewritten versions, or whether the induction rules for 
the rewritten nested types are sufficiently expressive to prove all results about 
the original nested types that one would expect to be provable by induction. 
This latter point echoes the issue with Coq-derived induction rules for forests 
mentioned above, and has the unfortunate effect of forcing users to manually 
write induction (and other) rules for such types for use in that system [17]. 
Direct application of the general technique illustrated above and explicated 
in full in Section 4 below derives the following first-ever useful structural 
induction rule for bushes, a full 20 years after bushes were first introduced: 


∀ (P : ∀ (a : Set) → Bush a → Set) → 
(∀ (a : Set) → P a BNil) → 
(∀ (a : Set) (x : a) (y : Bush (Bush a)) → P (Bush a) y → P a (BCons x y)) → 
∀ (a : Set) (x : Bush a) → P a x 

In the next section we will see that this rule is derivable from the following 
more general one: 


∀ (P : ∀ (a : Set) → (a → Set) → Bush a → Set) → 
(∀ (a : Set) (Q : a → Set) → P a Q BNil) → 
(∀ (a : Set) (Q : a → Set) (x : a) (y : Bush (Bush a)) → 
Q x → P (Bush a) (P a Q) y → P a Q (BCons x y)) → 
∀ (a : Set) (Q : a → Set) (x : Bush a) → Bush^ Q x → P a Q x 
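
The self-application visible in the hypothesis P (Bush a) (P a Q) y also shows up in the lifting Bush^ itself. The following Haskell sketch, with Bool standing in for Set and with the hypothetical names bushLift and exampleBush, is one way to write it: the lifting is applied to the tail at type Bush (Bush a), using the lifting at type Bush a as the predicate on its elements.

```haskell
-- The truly nested type of bushes.
data Bush a = BNil | BCons a (Bush (Bush a))

-- Bool-valued shadow of Bush^. The recursive call is polymorphically
-- recursive: it runs at type Bush (Bush a) with predicate bushLift q.
bushLift :: (a -> Bool) -> Bush a -> Bool
bushLift _ BNil        = True
bushLift q (BCons x b) = q x && bushLift (bushLift q) b

-- A small bush containing the data elements 1 and 2.
exampleBush :: Bush Int
exampleBush = BCons 1 (BCons (BCons 2 BNil) BNil)
```

On exampleBush, bushLift (> 0) holds because both 1 and 2 are positive, and bushLift even fails at the first element.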


4 Theoretical Foundations 


This section gives a grammar generating a robust class of nested types, including 
ADTs and truly nested types, and recaps the semantics given in [13] for them 
from which we derive their deep induction rules. This entire paper can thus be 
read as a practical application of the abstract results of [13]. 


4.1 Categorical Preliminaries 


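We write $a : \mathcal{A}$ if $\mathcal{A}$ is a category and $a$ is an object of $\mathcal{A}$. We write 
$0_{\mathcal{A}}$ and $1_{\mathcal{A}}$ for the initial and terminal objects of $\mathcal{A}$, and $o_A$ and $!_A$ 
for the unique maps $o_A : 0_{\mathcal{A}} \to A$ and $!_A : A \to 1_{\mathcal{A}}$, respectively. If $\mathcal{A}$ is 
the category $\mathsf{Set}$ of sets and functions between them, we write $0$ for $0_{\mathsf{Set}}$, 
i.e., for $\emptyset$, and $1$ for any 1-element set, i.e., for $1_{\mathsf{Set}}$. If $a : \mathcal{A}$ we write 
$K_a$ for the constantly $a$-valued functor on $\mathcal{A}$. 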
The category Fam, which we will use to interpret predicates, is given by: 


Definition 1. The category Fam comprises the following: 


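- Objects: An object of Fam is a pair $(A, P)$ where $A : \mathsf{Set}$ and $P : A \to \mathsf{Set}$. 

- Morphisms: A morphism $f : (A, P) \to (A', P')$ in Fam is a pair $(\alpha, \beta)$, 
where $\alpha : A \to A'$ and $\beta : \Pi_{a : A}\, P a \to P'(\alpha a)$. 

- Identities: The identity morphism $\mathrm{id}_{(A,P)} : (A, P) \to (A, P)$ in Fam is 
$(\mathrm{id}_A, \lambda a : A.\, \mathrm{id}_{P a})$. 

- Composition: If $(\alpha, \beta) : (A, P) \to (A', P')$ and $(\alpha', \beta') : (A', P') \to (A'', P'')$, 
then the composition $(\alpha', \beta') \circ (\alpha, \beta) : (A, P) \to (A'', P'')$ in Fam is defined 
by $(\alpha', \beta') \circ (\alpha, \beta) = (\alpha' \circ \alpha, \lambda a : A.\, \beta'(\alpha a) \circ \beta a)$. 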


4.2 Syntax and Semantics of ADTs 


If $\mathbb{V}$ is a countable set of type variables, $V \subseteq \mathbb{V}$ is finite, $\alpha \in \mathbb{V}$, and we write 
$V, \alpha$ for $V \cup \{\alpha\}$, then the following grammar generates (representations of) all 
standard polynomial ADTs over $V$, i.e., all ADTs defined over data of primitive 
types: 

$$\mathcal{A}^V := 0 \mid 1 \mid \alpha \in V \mid \mathcal{A}^V + \mathcal{A}^V \mid \mathcal{A}^V \times \mathcal{A}^V \mid \mu\alpha.\,\mathcal{A}^{V,\alpha}$$

The grammar $\mathcal{A} = \bigcup_V \mathcal{A}^V$ also generates (representations of) deep ADTs, i.e., 
ADTs defined not just over data of the primitive types, but over data of other 
ADTs as well. For example, it generates the representation 
$\mathsf{List}\,\alpha := \mu\beta.\,1 + \alpha \times \beta$ of the type List a, the representation 
$\mathsf{Forest}\,\alpha := \mu\beta.\,1 + \alpha \times \mu\gamma.\,1 + \beta \times \gamma$ of the type Forest a, and the 
representation $\mu\delta.\,1 + (\mu\beta.\,1 + \alpha \times \mu\gamma.\,1 + \beta \times \gamma) \times \delta$ of the 
type List (Forest a). Using Bekić’s Lemma, it can also generate (representations 
of) ADTs defined by mutual recursion such as $\mathsf{Expr} := \mu\alpha.\,s(\alpha, \mu\beta.\,t(\alpha, \beta))$ and 
$\mathsf{BExpr} := \mu\beta.\,t(\mathsf{Expr}, \beta)$, where $s(\alpha, \beta) := \mathsf{Nat} + \alpha \times \alpha + \beta \times \alpha \times \alpha$ and 
$t(\alpha, \beta) := \mathsf{Bool} + \beta \times \beta + \beta + \alpha \times \alpha$ for the ADTs of integer and boolean 
expressions from Section 1. ADTs with more than one type argument can be 
handled by tupling them into one or, equivalently, by noting that such ADTs 
are generated by the extension $\mathcal{N}$ of the grammar $\mathcal{A}$ given in Section 4.4. We 
adopt the usual conventions regarding free and bound type variables for $\mathcal{A}$. 
As usual, ADTs are interpreted relative to environments. 


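Definition 2. A set environment $\sigma$ is a function from a finite subset $V$ of $\mathbb{V}$ 
to $\mathsf{Set}$. We write $\mathsf{Env}^{\mathsf{Set}}_V$ for the set of set environments whose domain is $V$. If 
$A \in \mathsf{Set}$, $\sigma \in \mathsf{Env}^{\mathsf{Set}}_V$, and $\alpha \notin V$, then $\sigma[\alpha := A]$ is the set environment with 
domain $V, \alpha$ that extends $\sigma$ by mapping $\alpha$ to $A$. We write $\alpha\sigma$ in place of $\sigma(\alpha)$ 
for the image of $\alpha$ under $\sigma$, and $[]$ for the set environment with domain $V = \emptyset$. 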


It is well-known that the ADTs generated by the grammar A have initial 
algebra semantics in the category Set. That is, each such ADT $\mu\alpha.E$ can be 
interpreted as the carrier $\mu F$ of the initial algebra for the polynomial endofunctor 
F on Set that interprets its body E. In particular, the final clause of the next 
definition is well-defined. 


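Definition 3. The interpretation function $[\![\cdot]\!]^{\mathsf{Set}} : \mathcal{A}^V \to \mathsf{Env}^{\mathsf{Set}}_V \to \mathsf{Set}$ is: 

$$\begin{aligned}
[\![0]\!]^{\mathsf{Set}}\sigma &= 0 \\
[\![1]\!]^{\mathsf{Set}}\sigma &= 1 \\
[\![\alpha]\!]^{\mathsf{Set}}\sigma &= \alpha\sigma \\
[\![E_1 + E_2]\!]^{\mathsf{Set}}\sigma &= [\![E_1]\!]^{\mathsf{Set}}\sigma + [\![E_2]\!]^{\mathsf{Set}}\sigma \\
[\![E_1 \times E_2]\!]^{\mathsf{Set}}\sigma &= [\![E_1]\!]^{\mathsf{Set}}\sigma \times [\![E_2]\!]^{\mathsf{Set}}\sigma \\
[\![\mu\alpha.E]\!]^{\mathsf{Set}}\sigma &= \mu(A \mapsto [\![E]\!]^{\mathsf{Set}}\sigma[\alpha := A])
\end{aligned}$$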


Like Set, the category Fam has sufficient structure to interpret ADTs gener- 
ated by the grammar A. In particular, it interprets bodies of polynomial ADTs. 


Definition 4. The category Fam supports the following constructions: 


- Initial object: The initial object $0$ of Fam is $(0, K_0 : 0 \to \mathsf{Set})$. For $(A, P)$ : Fam, 
$(o_A, \lambda x : 0.\, o_{P(o_A x)}) : 0 \to (A, P)$ is the unique map from $0$ to $(A, P)$. 

- Terminal object: The terminal object $1$ of Fam is $(1, K_1 : 1 \to \mathsf{Set})$, where 
$()$ is the unique element of the set $1$ and $K_1() = 1$. For $(A, P)$ : Fam, 
$(!_A, \lambda a : A.\, !_{P a}) : (A, P) \to 1$ is the unique map from $(A, P)$ to $1$. 

- Coproducts: Given $(A, P), (A', P')$ : Fam, the coproduct $(A, P) + (A', P')$ : Fam 
is $(A + A', P + P')$, where $P + P' : A + A' \to \mathsf{Set}$ is just the usual 
coproduct of $P$ and $P'$ as functions. The associated injections 
$\mathit{inL} : (A, P) \to (A, P) + (A', P')$ and $\mathit{inR} : (A', P') \to (A, P) + (A', P')$ are given by 
$\mathit{inL} = (\mathit{inL}, \lambda a : A.\, \mathrm{id}_{P a})$ and $\mathit{inR} = (\mathit{inR}, \lambda a' : A'.\, \mathrm{id}_{P' a'})$. The coproduct 
$(\alpha, \beta) + (\alpha', \beta') : (A, P) + (A', P') \to (B, Q)$ of morphisms $(\alpha, \beta) : (A, P) \to (B, Q)$ 
and $(\alpha', \beta') : (A', P') \to (B, Q)$ is $(\alpha + \alpha', \delta)$, where 
$\delta : \Pi_{x : A + A'}\, (P + P')x \to Q((\alpha + \alpha')x)$ is defined by $\delta(\mathit{inL}\,a) = \beta a$ and 
$\delta(\mathit{inR}\,a') = \beta' a'$. As expected, $((\alpha, \beta) + (\alpha', \beta')) \circ \mathit{inL} = (\alpha, \beta)$ and 
$((\alpha, \beta) + (\alpha', \beta')) \circ \mathit{inR} = (\alpha', \beta')$. 

- Products: Given $(A, P), (A', P')$ : Fam, the product $(A, P) \times (A', P')$ : Fam 
is $(A \times A', \lambda(a, a') : A \times A'.\, P a \times P' a')$. The associated projections 
$\pi_1 : (A, P) \times (A', P') \to (A, P)$ and $\pi_2 : (A, P) \times (A', P') \to (A', P')$ are given 
by $\pi_1 = (\pi_1, \lambda(a, a') : A \times A'.\, \pi_1)$ and $\pi_2 = (\pi_2, \lambda(a, a') : A \times A'.\, \pi_2)$. The 
product $(\alpha, \beta) \times (\alpha', \beta') : (A, P) \to (B, Q) \times (B', Q')$ of morphisms 
$(\alpha, \beta) : (A, P) \to (B, Q)$ and $(\alpha', \beta') : (A, P) \to (B', Q')$ is 
$(\lambda a : A.\, (\alpha a, \alpha' a),\ \lambda a : A.\, \lambda x : P a.\, (\beta a x, \beta' a x))$. As expected, 
$\pi_1 \circ ((\alpha, \beta) \times (\alpha', \beta')) = (\alpha, \beta)$ and $\pi_2 \circ ((\alpha, \beta) \times (\alpha', \beta')) = (\alpha', \beta')$. 

To interpret ADTs generated by $\mathcal{A}$ in Fam we also need to be able to interpret 
expressions of the form $\mu\alpha.E$. This we do by computing the least fixed point in 
Fam of the functor $G : \mathsf{Fam} \to \mathsf{Fam}$ interpreting $E$. It is natural to try to do 
this using the same technique in Fam that gives its Set-interpretation, i.e., by 
iterating $G$ $\omega$-many times starting from the initial object $0$ of Fam. This gives 
the least fixed point $\mu G$ of $G$ as the colimit $G^\omega 0$ in Fam of the sequence 

$$0 \hookrightarrow G0 \hookrightarrow G^2 0 \hookrightarrow \cdots \hookrightarrow G^n 0 \hookrightarrow G^{n+1} 0 \hookrightarrow \cdots \qquad (*)$$
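
For intuition about such a chain, consider its Set-level counterpart for the list functor $F\,Y = 1 + A \times Y$ over a finite set $A$. The $n$-th approximant $F^n 0$ then consists of the lists of length less than $n$, so its cardinality can be computed by iterating $y \mapsto 1 + |A| \cdot y$ from $0$. The Haskell sketch below does exactly this arithmetic; stepSize and approxSizes are hypothetical names, and the sketch says nothing about the predicate component in Fam.

```haskell
-- Cardinality of F Y = 1 + A x Y, given |A| = a and |Y| = y.
stepSize :: Integer -> Integer -> Integer
stepSize a y = 1 + a * y

-- Cardinalities |F^n 0| for n = 0, 1, 2, ... (an infinite list).
approxSizes :: Integer -> [Integer]
approxSizes a = iterate (stepSize a) 0
```

For a two-element $A$, take 5 (approxSizes 2) evaluates to [0,1,3,7,15], i.e., $2^n - 1$ lists of length less than $n$; the colimit of the chain collects all finite lists.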


This approach is indeed viable, and is formally justified by [13]. There, it is 
shown that if $\lambda$ is a regular cardinal, $\mathcal{C}$ is a locally $\lambda$-presentable category, and 
$G : \mathcal{C} \to \mathcal{C}$ is a $\lambda$-accessible functor drawn from a particular class of functors 
that goes far beyond just first-order polynomial ones, then the least fixed point 
$\mu G$ of $G$ exists in $\mathcal{C}$ and can be computed as the transfinite colimit $G^\lambda 0$ of the 
sequence $0 \hookrightarrow G0 \hookrightarrow G^2 0 \hookrightarrow \cdots \hookrightarrow G^n 0 \hookrightarrow \cdots \hookrightarrow G^\omega 0 \hookrightarrow \cdots \hookrightarrow G^\alpha 0 \hookrightarrow \cdots$ over 
all $\alpha < \lambda$. That the sequence (*) computes $\mu G$ for all polynomial functors on 
Fam then follows by taking $\lambda$ to be $\omega$, noting that Fam is locally finitely presentable, 
and recalling that all such functors are $\omega$-accessible. That (*) further computes 
$\mu G$ for every functor $G$ on Fam that interprets an expression generated by $\mathcal{A}$ 
now follows easily by structural induction. We record this as: 


Theorem 1. If $G : \mathsf{Fam} \to \mathsf{Fam}$ is a functor interpreting an expression (with 
a distinguished variable) generated by the grammar $\mathcal{A}$, then the least fixed point 
$\mu G$ of $G$ (with respect to that variable) is $G^\omega 0$. Concretely, the colimit $G^\omega 0$ can 
be computed as $\varinjlim_{n \in \mathbb{N}} (A_n, P_n) = (A, P)$, where $A = \varinjlim_{n \in \mathbb{N}} A_n$ with mediating 
morphisms $\alpha_n : A_n \to A$, and $P$ is defined by $P x = \varinjlim_{n \in \mathbb{N},\, y \in \alpha_n^{-1}(x)} P_n y$. 

To define interpretations in Fam for ADTs generated by $\mathcal{A}$ we need the following 
analogue of Definition 2: 


Definition 5. A predicate environment $\rho$ is a function from a finite subset $V$ of 
$\mathbb{V}$ to Fam. We write $\mathsf{Env}^{\mathsf{Fam}}_V$ for the set of predicate environments whose domain 
is $V$. If $(A, P) \in \mathsf{Fam}$, $\rho \in \mathsf{Env}^{\mathsf{Fam}}_V$, and $\alpha \notin V$, we write $\rho[\alpha := (A, P)]$ for the 
predicate environment with domain $V, \alpha$ that extends $\rho$ by mapping $\alpha$ to $(A, P)$. 
We write $\alpha\rho$ in place of $\rho(\alpha)$ for the image of $\alpha$ under $\rho$. 

Let $\sigma \in \mathsf{Env}^{\mathsf{Set}}_V$. If $\rho \in \mathsf{Env}^{\mathsf{Fam}}_V$ is such that $\pi_1(\alpha\rho) = \alpha\sigma$ for all $\alpha \in V$ then 
we say that $\rho$ is a lifting of $\sigma$. We write $\hat{\sigma}$ for the particular lifting $\rho$ of $\sigma$ such 
that $\alpha\rho = (\alpha\sigma, K_1)$ for all $\alpha \in V$. In addition, if $\rho \in \mathsf{Env}^{\mathsf{Fam}}_V$ maps each $\alpha \in V$ 
to $(A_\alpha, P_\alpha)$ then we write $\pi_1\rho$ for the set environment with domain $V$ mapping 
each $\alpha \in V$ to $A_\alpha$. We write $[]$ for the unique environment with domain $V = \emptyset$. 


We then have the following Fam-interpretations for ADTs generated by A: 


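Definition 6. The interpretation function $[\![\cdot]\!]^{\mathsf{Fam}} : \mathcal{A}^V \to \mathsf{Env}^{\mathsf{Fam}}_V \to \mathsf{Fam}$ is: 

$$\begin{aligned}
[\![0]\!]^{\mathsf{Fam}}\rho &= 0 \\
[\![1]\!]^{\mathsf{Fam}}\rho &= 1 \\
[\![\alpha]\!]^{\mathsf{Fam}}\rho &= \alpha\rho \\
[\![E_1 + E_2]\!]^{\mathsf{Fam}}\rho &= [\![E_1]\!]^{\mathsf{Fam}}\rho + [\![E_2]\!]^{\mathsf{Fam}}\rho \\
[\![E_1 \times E_2]\!]^{\mathsf{Fam}}\rho &= [\![E_1]\!]^{\mathsf{Fam}}\rho \times [\![E_2]\!]^{\mathsf{Fam}}\rho \\
[\![\mu\alpha.E]\!]^{\mathsf{Fam}}\rho &= \mu(Z \mapsto [\![E]\!]^{\mathsf{Fam}}\rho[\alpha := Z])
\end{aligned}$$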


Before showing how to derive induction rules for the ADTs generated by A we 
prove two crucial lemmas linking their Set- and Fam-interpretations. 


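Lemma 1. If $E \in \mathcal{A}^V$ and $\rho \in \mathsf{Env}^{\mathsf{Fam}}_V$, then $\pi_1([\![E]\!]^{\mathsf{Fam}}\rho) = [\![E]\!]^{\mathsf{Set}}(\pi_1\rho)$. 
Furthermore, if $\pi_2(\beta\rho) = K_1$ for all $\beta \in V$, then $\pi_2([\![E]\!]^{\mathsf{Fam}}\rho) = K_1$. 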


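Proof. By induction on the structure of expressions. The only non-trivial case 
is for $\mu\alpha.E \in \mathcal{A}^V$. Let $\rho \in \mathsf{Env}^{\mathsf{Fam}}_V$ be given. Letting $F : \mathsf{Set} \to \mathsf{Set}$ be defined 
by $F A = [\![E]\!]^{\mathsf{Set}}(\pi_1\rho)[\alpha := A]$ and $G : \mathsf{Fam} \to \mathsf{Fam}$ be defined by 
$G(A, Q) = [\![E]\!]^{\mathsf{Fam}}\rho[\alpha := (A, Q)]$, the induction hypothesis gives 

$$\pi_1(G(A, Q)) = \pi_1([\![E]\!]^{\mathsf{Fam}}\rho[\alpha := (A, Q)]) = [\![E]\!]^{\mathsf{Set}}(\pi_1\rho)[\alpha := A] = F A \qquad (\dagger)$$

and if $\pi_2(\beta\rho) = K_1$ for all $\beta \in V$ then, moreover, $\pi_2(G(A, K_1)) = K_1$. We then 
have 

$$\begin{aligned}
\pi_1([\![\mu\alpha.E]\!]^{\mathsf{Fam}}\rho) &= \pi_1(\mu((A, Q) \mapsto [\![E]\!]^{\mathsf{Fam}}\rho[\alpha := (A, Q)])) = \pi_1(\mu G) \\
&= \pi_1(\varinjlim_{n \in \mathbb{N}} G^n 0) = \varinjlim_{n \in \mathbb{N}} \pi_1(G^n 0) = \varinjlim_{n \in \mathbb{N}} F^n 0 = \mu F \\
&= \mu(A \mapsto [\![E]\!]^{\mathsf{Set}}(\pi_1\rho)[\alpha := A]) = [\![\mu\alpha.E]\!]^{\mathsf{Set}}(\pi_1\rho).
\end{aligned}$$

Here, the fourth equality is justified by Theorem 1, and the fifth is justified 
by $(\dagger)$ and induction on $n$. If $\pi_2(\beta\rho) = K_1$ for all $\beta \in V$ as well, then 

$$\begin{aligned}
\pi_2([\![\mu\alpha.E]\!]^{\mathsf{Fam}}\rho) &= \pi_2(\mu((A, Q) \mapsto [\![E]\!]^{\mathsf{Fam}}\rho[\alpha := (A, Q)])) = \pi_2(\mu G) \\
&= \pi_2(\varinjlim_{n \in \mathbb{N}} G^n 0) = \pi_2(\varinjlim_{n \in \mathbb{N}} (F^n 0, K_1)) = \lambda x.\, \varinjlim_{n \in \mathbb{N},\, y \in \alpha_n^{-1}(x)} K_1 y = K_1.
\end{aligned}$$

Here, the morphisms $\alpha_n : F^n 0 \to \mu F$ are the mediating morphisms for the 
colimit, as in Theorem 1, and the fourth equality is justified by the fact that 
$\pi_2(G(A, K_1)) = K_1$ and induction on $n$. 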


Corollary 1. If $E$ is closed then $[\![E]\!]^{\mathsf{Fam}}[] = ([\![E]\!]^{\mathsf{Set}}[], K_1)$. 


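Lemma 2. If $\sigma \in \mathsf{Env}^{\mathsf{Set}}_V$, and if $F : \mathsf{Set} \to \mathsf{Set}$ and $G : \mathsf{Fam} \to \mathsf{Fam}$ are 
given by $F A = [\![E]\!]^{\mathsf{Set}}\sigma[\alpha := A]$ and $G(A, Q) = [\![E]\!]^{\mathsf{Fam}}\hat{\sigma}[\alpha := (A, Q)]$, then 
$\mu G = (\mu F, K_1)$. 


Proof. We have $\mu G = \mu((A, Q) \mapsto [\![E]\!]^{\mathsf{Fam}}\hat{\sigma}[\alpha := (A, Q)]) = [\![\mu\alpha.E]\!]^{\mathsf{Fam}}\hat{\sigma} = 
([\![\mu\alpha.E]\!]^{\mathsf{Set}}\sigma, K_1) = (\mu F, K_1)$, where the third equality holds by Lemma 1. 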


4.3 Induction Rules for ADTs 


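To derive induction rules for the ADTs generated by $\mathcal{A}$, we first observe that, 
given an ADT $\mu\alpha.E \in \mathcal{A}^V$ and a set environment $\sigma \in \mathsf{Env}^{\mathsf{Set}}_V$ interpreting its free 
variables, the interpretation $[\![E]\!]^{\mathsf{Set}}\sigma$ defines a functor $F_\sigma A = [\![E]\!]^{\mathsf{Set}}\sigma[\alpha := A]$ 
such that $[\![\mu\alpha.E]\!]^{\mathsf{Set}}\sigma = \mu(A \mapsto [\![E]\!]^{\mathsf{Set}}\sigma[\alpha := A]) = \mu(A \mapsto F_\sigma A) = \mu F_\sigma$. We can 
therefore think of $F_\sigma$ as representing the data type constructor associated with 
the ADT. Thus, as argued in [11,12], the semantic induction rule for proving 
predicates over the $\sigma$-instance of the ADT $\mu\alpha.E$ has the form 

$$\forall (P : \mu F_\sigma \to \mathsf{Set}).\ ??? \to \forall (x : \mu F_\sigma).\ P x$$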


for some appropriate hypotheses ???. We can use the Fam-interpretation of E to 
discover a semantic counterpart to the hypotheses ???. Reflecting the resulting 
semantic rule for the $\sigma$-instance of $\mu\alpha.E$ back into the programming language 
syntax will then derive induction rules for polynomial ADTs. 

To deduce what ??? is, we first observe that the conclusion $\forall (x : \mu F_\sigma).\ P x$ 
of the induction rule for the $\sigma$-instance of $\mu\alpha.E$ is isomorphic to the type of the 
second component of a morphism in Fam from $(\mu F_\sigma, K_1)$ to $(\mu F_\sigma, P)$ whose first 
component is id. Lemma 1 suggests that if we can see $(\mu F_\sigma, K_1)$ as $\mu G$ for some 
functor $G : \mathsf{Fam} \to \mathsf{Fam}$, then we can fold over a $G$-algebra on $(\mu F_\sigma, P)$ in Fam 
to get such a morphism, i.e., to inhabit the type that is the structural induction 
rule for the $\sigma$-instance of $\mu\alpha.E$. This will provide a proof $\mathrm{ind}_{\mu\alpha.E}\,P$ that the 
property $P$ holds for all elements of the $\sigma$-instance of $\mu\alpha.E$. 

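To this end, let $\rho \in \mathsf{Env}^{\mathsf{Fam}}_V$ be any lifting of $\sigma$, and consider again the functor 
$F_\rho(A, Q) = [\![E]\!]^{\mathsf{Fam}}\rho[\alpha := (A, Q)]$ on Fam given in Lemma 1 (there called $G$). An 
$F_\rho$-algebra structure on $(\mu F_\sigma, P)$ is a morphism $(k', k) : F_\rho(\mu F_\sigma, P) \to (\mu F_\sigma, P)$ 
in Fam. Then $\pi_1(F_\rho(\mu F_\sigma, P)) = \pi_1([\![E]\!]^{\mathsf{Fam}}\rho[\alpha := (\mu F_\sigma, P)]) = 
[\![E]\!]^{\mathsf{Set}}(\pi_1\rho)[\alpha := \mu F_\sigma] = [\![E]\!]^{\mathsf{Set}}\sigma[\alpha := \mu F_\sigma] = F_\sigma(\mu F_\sigma)$, with the step from 
the Fam-interpretation to the Set-interpretation holding by Lemma 1. If we take 
$k' = \mathit{in}$, then $k : \forall (x : F_\sigma(\mu F_\sigma)).\ \pi_2([\![E]\!]^{\mathsf{Fam}}\rho[\alpha := (\mu F_\sigma, P)])\,x \to P(\mathit{in}\,x)$, 
so that 

$$\begin{aligned}
\mathrm{ind}_{\mu\alpha.E,\,\rho} :\ & \forall (P : \mu F_\sigma \to \mathsf{Set}). \\
& (\forall (x : F_\sigma(\mu F_\sigma)).\ \pi_2([\![E]\!]^{\mathsf{Fam}}\rho[\alpha := (\mu F_\sigma, P)])\,x \to P(\mathit{in}\,x)) \\
& \to \forall (x : \mu F_\sigma).\ P x \\
\mathrm{ind}_{\mu\alpha.E,\,\rho}\ P\ k\ x =\ & \pi_2(\mathit{fold}_{F_\rho}(\mathit{in}, k))\,x \qquad (\ddagger)
\end{aligned}$$

Here, $\mathit{fold}_{F_\rho}(\mathit{in}, k)$ is the unique $F_\rho$-algebra morphism from $\mathit{in} : F_\rho(\mu F_\rho) \to 
\mu F_\rho$ to $(\mathit{in}, k)$ in Fam. 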

Taking $\rho = \hat{\sigma}$ in the above development derives the expected structural 
induction rules for ADTs generated by $\mathcal{A}$. But this development is actually 
far more flexible, since the induction rule it derives is parameterized over an 
arbitrary lifting $\rho$ of the set environment $\sigma$, and later specialized to $\hat{\sigma}$ to obtain 
structural induction rules for ADTs. The non-specialized rule can therefore be 
used to prove properties of ADTs that are parameterized over non-trivial (i.e., 
non-$K_1$) predicates on the type parameters to the type constructors induced by 
those ADTs; these are precisely our deep induction rules for ADTs. 

As expected, the conclusion of an ADT’s deep induction rule will have an 
additional hypothesis involving the lifting of this predicate to that ADT. As we 
have seen, the ability to lift a predicate $Q$ on a set $A$ to a predicate $T^{\wedge} Q$ on $T A$, 
where $T$ is an ADT’s type constructor, is therefore central to deep induction. 
Every type constructor for every ADT generated by the grammar $\mathcal{A}$ has such a 
lifting. Concretely, it is computed as the second component of the interpretation 
in Fam of that data type. For example, the lifting $\mathsf{List}^{\wedge} Q : \mathsf{List}\,A \to \mathsf{Set}$ is 
$\pi_2[\![\mu\beta.\,1 + \alpha \times \beta]\!]^{\mathsf{Fam}}[\alpha := (A, Q)]$. This can be coded in Agda as 


List^ : ∀ {a : Set} → (a → Set) → (List a → Set) 
List^ Q Nil = ⊤ 
List^ Q (Cons x xs) = Q x × List^ Q xs 


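Example 1. The deep induction rule for lists can be computed as the type of 
$\mathrm{ind}_{\mathsf{List}\,\alpha,\,\rho}$ for the ADT $\mathsf{List}\,\alpha := \mu\beta.\,1 + \alpha \times \beta$ and the predicate environment 
$\rho = [\alpha := (A, Q)]$ for $(A, Q) \in \mathsf{Fam}$. Letting $F Y = [\![1 + \alpha \times \beta]\!]^{\mathsf{Set}}(\pi_1\rho)[\beta := Y] 
= 1 + A \times Y$ with the obviously named injections, we have that $\mu F = \mathsf{List}\,A$. 
This gives the deep induction rule 

$$\begin{aligned}
\mathrm{ind}_{\mathsf{List}\,\alpha,\,\rho} :\ & \forall (P : \mu F \to \mathsf{Set}).\ \forall (Q : A \to \mathsf{Set}). \\
& (\forall x : F(\mu F).\ \pi_2([\![1 + \alpha \times \beta]\!]^{\mathsf{Fam}}[\alpha := (A, Q), \beta := (\mu F, P)])\,x \to P(\mathit{in}\,x)) \\
& \to \forall (x : \mu F).\ \mathsf{List}^{\wedge} Q\,x \to P x
\end{aligned}$$

Simplifying $\pi_2$’s argument gives $(1, K_1) + (A, Q) \times (\mu F, P)$. Its predicate part, 
obtained by applying $\pi_2$, is $K_1 + (Q \times P)$, so the hypotheses for $\mathrm{ind}_{\mathsf{List}\,\alpha,\,\rho}$ are 

$$\begin{aligned}
& \forall (x : 1 + A \times \mathsf{List}\,A).\ (K_1 + (Q \times P))\,x \to P(\mathit{in}\,x) \\
&= (\forall (x : 1).\ 1 \to P\,\mathsf{Nil}) \times (\forall (y : A).\ \forall (ys : \mathsf{List}\,A).\ Q y \to P\,ys \to P(\mathsf{Cons}\,y\,ys)) \\
&= P\,\mathsf{Nil} \times (\forall (y : A).\ \forall (ys : \mathsf{List}\,A).\ Q y \to P\,ys \to P(\mathsf{Cons}\,y\,ys))
\end{aligned}$$

Reflecting back into syntax gives the deep induction rule from Section 1: 


∀ (a : Set) (P : List a → Set) (Q : a → Set) → 
P Nil → (∀ (y : a) (ys : List a) → Q y → P ys → P (Cons y ys)) → 
∀ (xs : List a) → List^ Q xs → P xs 


Taking $Q = K_1$ gives the usual structural induction rule for lists from Section 1. 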


Example 2. Since Forest a and List (Forest a) are mutually recursively defined, 
the deep induction rule for forests is defined by mutual recursion with 
the deep induction rule for lists. It can be computed as the type of $\mathrm{ind}_{\mathsf{Forest}\,\alpha,\,\rho}$ 
for the ADT $\mathsf{Forest}\,\alpha := \mu\beta.\,1 + \alpha \times \mu\gamma.\,1 + \beta \times \gamma$ using the same technique as in 
Example 1. This gives the (deep) induction rule for forests from Section 1. 


Example 3. Exactly the same technique delivers the deep induction rules from 
Section 1 for the mutually recursive ADTs Expr and BExpr whose representations 
are given before Definition 2. 


4.4 Syntax and Semantics of Nested Types 


We can use the technique from Section 4.3 to derive induction rules for nested 
types as well, including truly nested types and other deep nested types. To do 
so we first need an extension of the grammar A that generates these types. 

Since nested types generalize ADTs to allow elements of a nested type at one 
instance of a type to depend on data at other instances of that nested type, they 
are interpreted as least fixed points not of ordinary (first-order) functors on Fam, 
as ADTs are, but rather as least fixed points of higher-order such functors. 
Moreover, since nested types can be parameterized over any number of type 
arguments, the (first-order) functors interpreting them can have correspondingly 
arbitrary arities. For each $k \geq 0$ we therefore assume a countable set $\mathbb{F}^k$ of functor 
variables of arity $k$, disjoint for distinct $k$. We use lower case Greek letters for 
functor variables, write $\varphi^k$ to indicate that $\varphi \in \mathbb{F}^k$, and say that $\varphi$ has arity $k$ 
in this case. Type variables are exactly functor variables of arity 0; we continue 
to write $\alpha$, $\beta$, etc., rather than $\alpha^0$, $\beta^0$, etc., for them. We write $\mathbb{F} = \bigcup_{k \geq 0} \mathbb{F}^k$. 
If $V \subseteq \mathbb{F}$ is finite and $\varphi \in \mathbb{F}^k$ for some $k$, write $V, \varphi$ for $V \cup \{\varphi\}$. 


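Definition 7. For a finite subset $V$ of $F$, the set of (truly) nested data types over
$V$ is generated by the following grammar:


$$\mathcal{N}^V := 0 \mid 1 \mid \varphi^k\,\overline{\mathcal{N}^V} \mid \mathcal{N}^V + \mathcal{N}^V \mid \mathcal{N}^V \times \mathcal{N}^V \mid (\mu\varphi^k.\lambda\alpha_1 \ldots \alpha_k.\ \mathcal{N}^{V,\alpha_1,\ldots,\alpha_k,\varphi})\,\overline{\mathcal{N}^V}$$


Here, $\varphi^k \in V$ and the lengths of the vectors of terms in $\mathcal{N}^V$ in the third and
final clauses of the above grammar are both $k$.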

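The grammar $\mathcal{N} = \bigcup_V \mathcal{N}^V$ generalizes $\mathcal{A}$ by allowing recursion not just at the
level of type variables, but also at the level of functor variables. This reflects the
fact that, in programming language syntax, nested types can be parameterized
over both types and type constructors. For example, $\mathcal{N}$ generates the representation
$\mathit{PTree}\,\alpha := (\mu\varphi^1.\lambda\beta.\ \beta + \varphi(\beta \times \beta))\,\alpha \in \mathcal{N}^{\alpha}$ of the type PTree a, the representation
$\mathit{Lam}\,\alpha := (\mu\varphi^1.\lambda\beta.\ \beta + \varphi\beta \times \varphi\beta + \varphi(\beta + 1))\,\alpha \in \mathcal{N}^{\alpha}$ of the type Lam a,
and the representation $\mathit{Bush}\,\alpha := (\mu\varphi^1.\lambda\beta.\ 1 + \beta \times \varphi(\varphi\beta))\,\alpha \in \mathcal{N}^{\alpha}$ of the type
Bush a. But it also generates the representation $\mathit{GForest}\,\varphi\,\alpha := \mu\beta.\ 1 + \alpha \times \varphi\beta \in \mathcal{N}^{\alpha,\varphi}$
of the following nested type of generalized forests with data of type a: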


    GForest f a = FEmpty | FNode a (f (GForest f a))


This type is higher-order in the sense that the type constructor GForest takes
not just a type, but also a (unary) type constructor, as an argument. It therefore
cannot be expressed as an element of $\mathcal{A}$, and thus demonstrates the benefit of
working with the more expressive grammar $\mathcal{N}$. On the other hand, it is decidedly
ADT-like, in the sense that it defines a family of inductive types rather than an
inductive family of types. In fact, if f were a type constructor induced by a
nested type generated by our grammar, then GForest f a and f (GForest f a)
would be mutually recursively defined. In this case, generalizing Example 2,
their structural induction rules would also be defined by mutual recursion.

It is not hard to see that $\mathcal{A} \subseteq \mathcal{N}$. Moreover, the grammar $\mathcal{N}$ allows nested
types to be parameterized over (other) nested data types, just as $\mathcal{A}$ allows ADTs
to be parameterized over (other) ADTs. For instance, we could have perfect trees
of lists or binary trees, bushes of perfect trees, etc.


We have the following notions of functor and application in Fam: 


Definition 8. A ($k$-ary) lifted functor $G : \mathsf{Fam}^k \to \mathsf{Fam}$ is a pair $(F, P)$, where
$F : \mathsf{Set}^k \to \mathsf{Set}$ and $P : \forall (X_1, P_1) \ldots (X_k, P_k).\ F X_1 \ldots X_k \to \mathsf{Set}$ is a
Fam-indexed predicate. The application of a functor $(F, P) : \mathsf{Fam}^k \to \mathsf{Fam}$ to an
object $(A_1, Q_1), \ldots, (A_k, Q_k)$ of $\mathsf{Fam}^k$ is given by


$$(F, P)(A_1, Q_1) \ldots (A_k, Q_k) = (F A_1 \ldots A_k,\ P (A_1, Q_1) \ldots (A_k, Q_k))$$


We call a lifted functor G = (F, P) a lifting of F from Set to Fam, and call P 
a Fam-indexed predicate. A Set-indexed predicate is a Fam-indexed predicate that 
does not depend on its arguments’ second components. We extend the notions of 
set environment and predicate environment from Definitions 2 and 5 as follows: 


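Definition 9. A set environment $\sigma$ is a mapping from a finite subset
$V = \{\varphi_1^{k_1}, \ldots, \varphi_n^{k_n}\}$ of $F$ such that $\sigma\varphi_i : \mathsf{Set}^{k_i} \to \mathsf{Set}$ for $i = 1, \ldots, n$. We write
$\mathsf{Env}^{\mathsf{Set}}_V$ for the set of set environments whose domain is $V$. If $F : \mathsf{Set}^k \to \mathsf{Set}$,
$\sigma \in \mathsf{Env}^{\mathsf{Set}}_V$, and $\varphi^k \notin V$, we write $\sigma[\varphi := F]$ for the set environment with domain
$V, \varphi$ that extends $\sigma$ by mapping $\varphi$ to $F$. Similarly, a predicate environment $\rho$ is
a mapping from a finite subset $V = \{\varphi_1^{k_1}, \ldots, \varphi_n^{k_n}\}$ of $F$ such that
$\rho\varphi_i : \mathsf{Fam}^{k_i} \to \mathsf{Fam}$ is a lifted functor for $i = 1, \ldots, n$. We write $\mathsf{Env}^{\mathsf{Fam}}_V$ for the set of predicate
environments whose domain is $V$. If $(F, P) : \mathsf{Fam}^k \to \mathsf{Fam}$, $\rho \in \mathsf{Env}^{\mathsf{Fam}}_V$, and
$\varphi^k \notin V$, we write $\rho[\varphi := (F, P)]$ for the predicate environment with domain $V, \varphi$
that extends $\rho$ by mapping $\varphi$ to $(F, P)$.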


The notions of a predicate environment being a lifting of a set environment and 
the notations F, 7p, and [| are now extended in the obvious ways. 

The following interpretations of nested types generated by $\mathcal{N}$ in the locally
finitely presentable categories Set and Fam are shown in [13] to be well-defined:


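Definition 10. The interpretation functions $[\![\cdot]\!]^{\mathsf{Set}} : \mathcal{N}^V \to \mathsf{Env}^{\mathsf{Set}}_V \to \mathsf{Set}$ and
$[\![\cdot]\!]^{\mathsf{Fam}} : \mathcal{N}^V \to \mathsf{Env}^{\mathsf{Fam}}_V \to \mathsf{Fam}$ are:


$$\begin{aligned}
[\![0]\!]^{\mathsf{Set}}\sigma &= 0 \\
[\![1]\!]^{\mathsf{Set}}\sigma &= 1 \\
[\![\varphi^k E_1 \ldots E_k]\!]^{\mathsf{Set}}\sigma &= (\sigma\varphi)(\overline{[\![E_i]\!]^{\mathsf{Set}}\sigma}) \\
[\![E_1 + E_2]\!]^{\mathsf{Set}}\sigma &= [\![E_1]\!]^{\mathsf{Set}}\sigma + [\![E_2]\!]^{\mathsf{Set}}\sigma \\
[\![E_1 \times E_2]\!]^{\mathsf{Set}}\sigma &= [\![E_1]\!]^{\mathsf{Set}}\sigma \times [\![E_2]\!]^{\mathsf{Set}}\sigma \\
[\![(\mu\varphi^k.\lambda\alpha_1 \ldots \alpha_k.E) E_1 \ldots E_k]\!]^{\mathsf{Set}}\sigma &= (\mu(F \mapsto \lambda A_1 \ldots A_k.\ [\![E]\!]^{\mathsf{Set}}\sigma[\alpha_i := A_i][\varphi := F]))\,(\overline{[\![E_i]\!]^{\mathsf{Set}}\sigma})
\end{aligned}$$

$$\begin{aligned}
[\![0]\!]^{\mathsf{Fam}}\rho &= 0 \\
[\![1]\!]^{\mathsf{Fam}}\rho &= 1 \\
[\![\varphi^k E_1 \ldots E_k]\!]^{\mathsf{Fam}}\rho &= (\rho\varphi)(\overline{[\![E_i]\!]^{\mathsf{Fam}}\rho}) \\
[\![E_1 + E_2]\!]^{\mathsf{Fam}}\rho &= [\![E_1]\!]^{\mathsf{Fam}}\rho + [\![E_2]\!]^{\mathsf{Fam}}\rho \\
[\![E_1 \times E_2]\!]^{\mathsf{Fam}}\rho &= [\![E_1]\!]^{\mathsf{Fam}}\rho \times [\![E_2]\!]^{\mathsf{Fam}}\rho \\
[\![(\mu\varphi^k.\lambda\alpha_1 \ldots \alpha_k.E) E_1 \ldots E_k]\!]^{\mathsf{Fam}}\rho &= (\mu(F \mapsto \lambda Z_1 \ldots Z_k.\ [\![E]\!]^{\mathsf{Fam}}\rho[\alpha_i := Z_i][\varphi := F]))\,(\overline{[\![E_i]\!]^{\mathsf{Fam}}\rho})
\end{aligned}$$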


4.5 Induction Rules for Nested Types 


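Straightforward generalization of the analysis in Section 4.3 to $\mathcal{N}$ gives induction
rules for the type constructors nested types induce. Given a nested type
$(\mu\varphi^k.\lambda\alpha_1 \ldots \alpha_k.E) E_1 \ldots E_k \in \mathcal{N}^V$ with type constructor $T = \mu\varphi^k.\lambda\alpha_1 \ldots \alpha_k.E$ and
a set environment $\sigma \in \mathsf{Env}^{\mathsf{Set}}_V$ interpreting its free variables, we have that


$$[\![T\,\overline{E}]\!]^{\mathsf{Set}}\sigma = \mu(F \mapsto \lambda A_1 \ldots A_k.\ [\![E]\!]^{\mathsf{Set}}\sigma[\alpha_i := A_i][\varphi := F])\,(\overline{[\![E_i]\!]^{\mathsf{Set}}\sigma}) = (\mu H_\sigma)(\overline{[\![E_i]\!]^{\mathsf{Set}}\sigma})$$

where the higher-order functor $H_\sigma$ on Set is defined by

$$H_\sigma F A_1 \ldots A_k = [\![E]\!]^{\mathsf{Set}}\sigma[\alpha_i := A_i][\varphi := F]$$


For any lifting $\rho$ of $\sigma$, the predicate counterpart to $H_\sigma$ is the higher-order functor
$\hat{H}_\rho$ on Fam whose action on a $k$-ary lifted functor $(F, P)$ is the $k$-ary lifted functor
$\hat{H}_\rho(F, P)$ given by


$$\hat{H}_\rho(F, P)(A_1, Q_1) \ldots (A_k, Q_k) = [\![E]\!]^{\mathsf{Fam}}\rho[\alpha_i := (A_i, Q_i)][\varphi := (F, P)]$$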


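The induction rule $\mathit{ind}_{T,\rho}$ for proving predicates over the $\sigma$-instance of the type
constructor $T$ relative to the lifting $\rho$ is thus given by


$$\begin{aligned}
\mathit{ind}_{T,\rho} :\ & \forall (P : \forall (A_i, Q_i).\ (\mu H_\sigma) A_i \to \mathsf{Set}). \\
& \quad (\forall (A_i, Q_i).\ \pi_2(\hat{H}_\rho(\mu H_\sigma, P))(A_i, Q_i) \to P (A_i, Q_i)) \to \\
& \quad (\forall (A_i, Q_i).\ \pi_2(\mu\hat{H}_\rho)(A_i, Q_i) \to P (A_i, Q_i)) \\
=\ & \forall (P : \forall (A_i, Q_i).\ (\mu H_\sigma) A_i \to \mathsf{Set}). \\
& \quad (\forall (A_i, Q_i).\ \forall (x : H_\sigma(\mu H_\sigma) A_i).\ \pi_2(\hat{H}_\rho(\mu H_\sigma, P))(A_i, Q_i)\,x \to P (A_i, Q_i)\,(\mathit{in}\,x)) \to \\
& \quad (\forall (A_i, Q_i).\ \forall (x : (\mu H_\sigma) A_i).\ \pi_2(\mu\hat{H}_\rho)(A_i, Q_i)\,x \to P (A_i, Q_i)\,x) \\
\mathit{ind}_{T,\rho} =\ & \lambda P\ k\ (A_i, Q_i).\ \pi_2(\mathit{fold}_{\hat{H}_\rho}(\mathit{in}, k))
\end{aligned}$$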


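To get analogues for nested types of the structural induction rules for ADTs
note that, since each $\sigma$-instance of the type constructor $T = \mu\varphi^k.\lambda\alpha_1 \ldots \alpha_k.E$
associated with a nested type $(\mu\varphi^k.\lambda\alpha_1 \ldots \alpha_k.E) E_1 \ldots E_k \in \mathcal{N}^V$ gives rise to an
inductive family of types, the appropriate notion of predicate for a nested type
is actually a Set-indexed predicate. By direct analogy with structural induction
for ADTs, the structural induction rule for a nested type with type constructor
$T$ whose $\sigma$-instance is interpreted by $\mu H_\sigma$ is then


$$\begin{aligned}
& \forall (P : \forall A_i.\ (\mu H_\sigma) A_i \to \mathsf{Set}). \\
& \quad (\forall A_i.\ \forall (x : H_\sigma(\mu H_\sigma) A_i).\ \pi_2(\hat{H}_\rho(\mu H_\sigma, \hat{P}))(A_i, K_1)\,x \to \hat{P}(A_i, K_1)\,(\mathit{in}\,x)) \to \\
& \quad (\forall A_i.\ \forall (x : (\mu H_\sigma) A_i).\ \pi_2(\mu\hat{H}_\rho)(A_i, K_1)\,x \to \hat{P}(A_i, K_1)\,x) \\
=\ & \forall (P : \forall A_i.\ (\mu H_\sigma) A_i \to \mathsf{Set}). \\
& \quad (\forall A_i.\ \forall (x : H_\sigma(\mu H_\sigma) A_i).\ \pi_2(\hat{H}_\rho(\mu H_\sigma, \hat{P}))(A_i, K_1)\,x \to \hat{P}(A_i, K_1)\,(\mathit{in}\,x)) \to \\
& \quad (\forall A_i.\ \forall (x : (\mu H_\sigma) A_i).\ P\,A_i\,x) \qquad\qquad (\dagger)
\end{aligned}$$


where $\hat{P}$ is defined below. To see that the structural induction rule $(\dagger)$ is indeed a
specialization of $\mathit{ind}_{T,\rho}$, suppose we are given a predicate $P : \forall A_i.\ (\mu H_\sigma) A_i \to \mathsf{Set}$
for a nested type with type constructor $T$ whose $\sigma$-instance is interpreted
by $\mu H_\sigma$, together with induction hypotheses


$$R : \forall A_i.\ \forall (x : H_\sigma(\mu H_\sigma) A_i).\ \pi_2(\hat{H}_\rho(\mu H_\sigma, \hat{P}))(A_i, K_1)\,x \to \hat{P}(A_i, K_1)\,(\mathit{in}\,x)$$


Let $\hat{P} : \forall (A_i, Q_i).\ (\mu H_\sigma) A_i \to \mathsf{Set}$ be the Fam-indexed predicate $\hat{P} = \lambda (A_i, Q_i).\ P\,A_i$,
and consider the instantiation $\mathit{ind}_{T,\rho}\ \hat{P}\ \hat{R}$, where the induction hypothesis
$\hat{R} : \forall (A_i, Q_i).\ \forall (x : H_\sigma(\mu H_\sigma) A_i).\ \pi_2(\hat{H}_\rho(\mu H_\sigma, \hat{P}))(A_i, Q_i)\,x \to \hat{P}(A_i, Q_i)\,(\mathit{in}\,x)$
for $\mathit{ind}_{T,\rho}$ is given by $\hat{R}\,(A_i, Q_i)\,x\,y = R\,A_i\,x\,(\pi_2(\hat{H}_\rho(\mu H_\sigma, \hat{P})\,t)\,x\,y)$.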


5 The General Methodology 


We can distill from the foundations given in Section 4 a general methodology
that will derive correct deep induction rules for any nested type generated by
$\mathcal{N}$. Concretely, this methodology comprises the following steps:


1. Given a nested data type definition D, translate its type constructor into an
expression N in the grammar $\mathcal{N}$ (or, more simply, $\mathcal{A}$, if D defines an ADT).

2. Interpret N in Set to get a fixpoint equation defining D as $\mu H$ for some
(higher-order) operator $H$.

3. Reinterpret N in Fam to define a corresponding (higher-order) operator $\hat{H}$ on
predicates whose fixed point $\mu\hat{H}$ is an inductive predicate on $\mu H$, i.e., on D.

4. Initiality of $\mu\hat{H}$ guarantees that there is a unique predicate morphism from
$\mu\hat{H}$ to any other predicate $P$ admitting an $\hat{H}$-algebra structure. This gives
D's deep induction rule.


These are precisely the steps carried out in all of our examples, including those 
below, which illustrate the derivation for nested types given in Section 4.5. 


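Example 4. Since the nested type $\mathit{Lam}\,\alpha := (\mu\varphi^1.\lambda\beta.\ \beta + \varphi\beta \times \varphi\beta + \varphi(\beta + 1))\,\alpha$
of lambda terms is uniform in its index $\alpha$, it induces a type constructor
$\mathit{Lam} := \mu\varphi^1.\lambda\beta.\ \beta + \varphi\beta \times \varphi\beta + \varphi(\beta + 1)$. Writing $H$ for $H_\sigma$ and $\hat{H}$ for $\hat{H}_\rho$, and letting


$$H F A = [\![\beta + \varphi\beta \times \varphi\beta + \varphi(\beta + 1)]\!]^{\mathsf{Set}}[\beta := A][\varphi := F] = A + F A \times F A + F(A + 1)$$

we have that $\mu H = \mathit{Lam}$ and that the predicate counterpart $\hat{H}$ to $H$ is given by


$$\begin{aligned}
\hat{H}(F, P)(A, Q) &= [\![\beta + \varphi\beta \times \varphi\beta + \varphi(\beta + 1)]\!]^{\mathsf{Fam}}[\beta := (A, Q)][\varphi := (F, P)] \\
&= (A, Q) + (F, P)(A, Q) \times (F, P)(A, Q) + (F, P)((A, Q) + (1, K_1)) \\
&= (A + F A \times F A + F(A + 1),\ \pi_2((A, Q) + (F, P)(A, Q) \times (F, P)(A, Q) + (F, P)((A, Q) + (1, K_1))))
\end{aligned}$$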


Reflecting $\mu\hat{H}$ back into syntax gives the inductive predicate


    Lam^ : ∀ (a : Set) → (a → Set) → (Lam a → Set) where
      Var^ : ∀ (a : Set) (Q : a → Set) (x : a) → Q x → Lam^ a Q (Var x)
      App^ : ∀ (a : Set) (Q : a → Set) (x : Lam a) (y : Lam a) → Lam^ a Q x →
             Lam^ a Q y → Lam^ a Q (App x y)
      Abs^ : ∀ (a : Set) (Q : a → Set) (x : Lam (Maybe a)) → Lam^ (Maybe a) (Maybe^ a Q) x →
             Lam^ a Q (Abs x)


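Now, if $P$ is any other predicate on Lam admitting an $\hat{H}$-algebra structure, then
there must exist a morphism
$k : \forall (x : A + \mathit{Lam}\,A \times \mathit{Lam}\,A + \mathit{Lam}(A + 1)).\ (Q + P\,A\,Q \times P\,A\,Q + P\,(A + 1)\,((+1)^{\wedge} Q))\,x \to P\,A\,Q\,(\mathit{in}\,x)$,
i.e., $k = (k_1, k_2, k_3)$, where


$$\begin{aligned}
k_1 &: \forall (x : A).\ Q\,x \to P\,A\,Q\,(\mathsf{Var}\,x) \\
k_2 &: \forall (x : \mathit{Lam}\,A).\ \forall (y : \mathit{Lam}\,A).\ P\,A\,Q\,x \to P\,A\,Q\,y \to P\,A\,Q\,(\mathsf{App}\,x\,y) \\
k_3 &: \forall (x : \mathit{Lam}\,(A + 1)).\ P\,(A + 1)\,((+1)^{\wedge} Q)\,x \to P\,A\,Q\,(\mathsf{Abs}\,x)
\end{aligned}$$


Since Lam^ reflects the initial $\hat{H}$-algebra, there is a unique algebra morphism
from $\mathit{in} : \hat{H}(\mu\hat{H}) \to \mu\hat{H}$ to the $\hat{H}$-algebra $k$ on $P$, i.e., from $\mu\hat{H}$ to $P$. Reflecting
this morphism back into syntax gives the deep induction rule for lambda terms
from Section 3.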
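To make the shape of the lifted predicate concrete, the following Python sketch gives a Boolean analogue of Lam^. It is only an illustration under stated assumptions: the tagged-tuple encoding of terms, the names `lam_hat` and `maybe_hat`, and the use of Boolean-valued predicates in place of Set-valued ones are all hypothetical and are not taken from the paper or its Agda code.

```python
# Hypothetical encoding (not from the paper): lambda terms as tagged tuples.
#   ("Var", x) | ("App", s, t) | ("Abs", body)
# Under an Abs, variables range over Maybe a, encoded as
#   ("Nothing",) for the freshly bound variable and ("Just", x) for an old one.

def maybe_hat(q):
    """Lift q : a -> bool to Maybe a; the fresh variable is always accepted (K1)."""
    def lifted(m):
        return True if m[0] == "Nothing" else q(m[1])
    return lifted


def lam_hat(q, term):
    """Boolean analogue of Lam^ a Q: q holds for every variable stored in term."""
    tag = term[0]
    if tag == "Var":
        return q(term[1])
    if tag == "App":
        return lam_hat(q, term[1]) and lam_hat(q, term[2])
    if tag == "Abs":
        # The predicate changes under the binder, mirroring
        # Lam^ (Maybe a) (Maybe^ a Q) in the Abs^ constructor.
        return lam_hat(maybe_hat(q), term[1])
    raise ValueError(f"unknown constructor: {tag}")


def is_prime(n):
    return n >= 2 and all(n % d for d in range(2, int(n ** 0.5) + 1))


# \x. x 7, whose only free variable is 7
good = ("Abs", ("App", ("Var", ("Nothing",)), ("Var", ("Just", 7))))
# \x. 8, whose only free variable is 8
bad = ("Abs", ("Var", ("Just", 8)))
```

Here `lam_hat(is_prime, good)` holds while `lam_hat(is_prime, bad)` does not, which corresponds to the kind of property of "lambda terms whose variables are represented by prime numbers" that a deep induction rule lets one reason about.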


The deep induction rule for lambda terms can be used to prove, e.g., prop- 
erties of lambda terms whose variables are represented by prime numbers or 
lambda terms over strings that can represent variable names. It can also be used 
to prove properties of lambda terms over lambda terms, such as the associativity 
laws needed to show that the functor Lam is a monad; such a proof is included 
as the first case study in the accompanying Agda code. The second case study uses
the deep induction rule we derive in Example 5 to prove some results about bushes.


Since truly nested types are a special case of deep nested types, our method- 
ology can derive useful induction rules for them — including the perpetually 
problematic truly nested type of bushes [8,10,15] introduced in Section 3. 


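Example 5. Since the truly nested type $\mathit{Bush}\,\alpha := (\mu\varphi^1.\lambda\beta.\ 1 + \beta \times \varphi(\varphi\beta))\,\alpha \in \mathcal{N}^{\alpha}$
is uniform in its index $\alpha$, it induces a type constructor
$\mathit{Bush} := \mu\varphi^1.\lambda\beta.\ 1 + \beta \times \varphi(\varphi\beta)$. Writing $H$ for $H_\sigma$ and $\hat{H}$ for $\hat{H}_\rho$, and letting


$$H F A = [\![1 + \beta \times \varphi(\varphi\beta)]\!]^{\mathsf{Set}}[\beta := A][\varphi := F] = 1 + A \times F(F A)$$

we have that $\mu H = \mathit{Bush}$ and the predicate counterpart $\hat{H}$ to $H$ is given by


$$\begin{aligned}
\hat{H}(F, P)(A, Q) &= [\![1 + \beta \times \varphi(\varphi\beta)]\!]^{\mathsf{Fam}}[\beta := (A, Q)][\varphi := (F, P)] \\
&= (1, K_1) + (A, Q) \times (F, P)((F, P)(A, Q)) \\
&= (1 + A \times F(F A),\ K_1 + Q \times \pi_2((F, P)((F, P)(A, Q))))
\end{aligned}$$

Reflecting $\mu\hat{H}$ back into syntax gives the inductive predicate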


    Bush^ : ∀ (a : Set) → (a → Set) → (Bush a → Set) where
      BNil^  : ∀ (a : Set) (Q : a → Set) → Bush^ a Q BNil
      BCons^ : ∀ (a : Set) (Q : a → Set) (x : a) (y : Bush (Bush a)) →
               Q x → Bush^ (Bush a) (Bush^ a Q) y → Bush^ a Q (BCons x y)


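Now, if $P$ is any other predicate on Bush admitting an $\hat{H}$-algebra structure, then
there must exist a morphism


$$\begin{aligned}
k :\ & \forall (x : 1 + A \times \mathit{Bush}\,(\mathit{Bush}\,A)).\ (K_1 + Q \times \pi_2((\mathit{Bush}, P)((\mathit{Bush}, P)(A, Q))))\,x \to P\,A\,Q\,(\mathit{in}\,x) \\
=\ & \forall (x : 1 + A \times \mathit{Bush}\,(\mathit{Bush}\,A)).\ (K_1 + Q \times P\,(\mathit{Bush}\,A)\,(P\,A\,Q))\,x \to P\,A\,Q\,(\mathit{in}\,x)
\end{aligned}$$


i.e., $(k_1, k_2)$, where $k_1 : \forall (x : 1).\ 1 \to P\,A\,Q\,\mathsf{BNil}$ and
$k_2 : \forall (x : A).\ \forall (y : \mathit{Bush}\,(\mathit{Bush}\,A)).\ Q\,x \to P\,(\mathit{Bush}\,A)\,(P\,A\,Q)\,y \to P\,A\,Q\,(\mathsf{BCons}\,x\,y)$. Since Bush^
reflects the initial $\hat{H}$-algebra, there is a unique predicate morphism from $\mu\hat{H}$ to
$P$. Reflecting this morphism back into syntax gives the deep induction rule for
bushes from Section 3.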
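The recursive occurrence of the lifted predicate inside its own argument can be seen in a small executable form. The Python sketch below is a Boolean analogue of Bush^, offered as an illustration only: the encoding of bushes as `None` or pairs, the name `bush_hat`, and the use of Boolean predicates are assumptions made here, not part of the paper's development.

```python
# Hypothetical encoding (not from the paper): a bush is None (BNil) or a pair
# (x, rest) (BCons x rest), where rest is a bush whose elements are bushes.

def bush_hat(q, bush):
    """Boolean analogue of Bush^ a Q: q holds for every element stored in bush."""
    if bush is None:
        return True
    x, rest = bush
    # rest : Bush (Bush a), so its elements are checked with Bush^ a Q itself,
    # mirroring the argument Bush^ (Bush a) (Bush^ a Q) of BCons^.
    return q(x) and bush_hat(lambda inner: bush_hat(q, inner), rest)


# BCons 2 (BCons (BCons 3 BNil) BNil)
example = (2, ((3, None), None))
```

With this encoding, `bush_hat(lambda n: n > 0, example)` holds, whereas `bush_hat(lambda n: n % 2 == 0, example)` fails because of the nested element 3. The predicate passed to the recursive call is built from `bush_hat` itself, which is the Boolean shadow of true nesting.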


The function BDind => MBDind in our Agda code shows that our methodology also 
derives a mutually recursive deep induction rule for bushes, there called MBDind. 
Examples 4 and 5 show that when the definition of a nested type contains an
instance of another nested type constructor C (e.g., Maybe a in the argument
Lam (Maybe a) to Abs), its inductive predicate definition, and thus its deep
induction rule, will involve a call to the predicate interpretation C^ of C. When
the definition contains an instance of the constructor for the same type being
defined (e.g., Bush a in the type argument Bush (Bush a) to BCons), its inductive
predicate definition, and thus its deep induction rule, will involve a recursive
call to the inductive predicate being defined. The treatment of a truly nested
type is thus exactly the same as the treatment of any other nested type.
Independently of deriving induction rules, even defining some nested types in 
Agda requires turning off its termination checks in a few tightly compartmental- 
ized places. For example, neither Coq nor Agda currently allows the definition 
of the bush data type because of the non-positive occurrence of Bush in the type 
of BCons. The correctness of our development in those places is justified by [13]. 
This work suggests that the current notion of positivity should be generalized. 


6 Related Work and Directions for Further Investigation 


As far as we know, the phenomenon of deep induction has not previously even 
been identified, let alone studied. This paper treats deep induction for nested 
types, which extend ADTs by allowing higher-order recursion. Other general- 
izations of ADTs are also well-studied in the literature, including (indexed) 
containers [1,2], which extend ADTs by allowing type dependency. In partic- 
ular, [3] defines a class of “nested” containers corresponding to inductive types 
whose constructors can recursively depend on the data type at different instances 
than the one being defined. The case of truly nested types is not treated there,
however. We hope eventually to extend the results of this paper to derive provably
correct deep induction rules for (indexed) containers, GADTs, dependent
types, and other classes of more advanced data types. One interesting question 
is whether or not a common generalization of indexed containers and the class 
of nested types studied here has a rigorous initial algebra semantics as in [13]. 
A more recent line of investigation concerns sized types [5]. These are par- 
ticularly well-suited to termination checking of (co)recursive definitions, and are 
implemented in the latest versions of Agda [6]. Although originally defined in 
the context of a type theory with higher-order functions [4], the current incar- 
nation of sized types does not appear to admit definitions with true nesting. 
What seems to be missing is an addition operation on sizes, which would allow
a constructor such as BCons to combine a structure with size of depth “up to α”
with one of depth “up to β” to define a data element of depth “up to α + β”.
Tassi [17] has independently implemented a tool for deriving induction princi- 
ples of data type definitions in Coq using unary parametricity. Although neither 
rigorous derivation nor justification is provided, his technique seems to be essen- 
tially equivalent to ours, and could perhaps be justified by our general framework. 
True nesting still is not permitted, however. In [7], mutually recursively defined 
induction and coinduction rules are derived for mutually recursive and corecur- 
sive data types. But these are still the standard structural (co)induction rules, 
rather than deep ones. This suggests a need for deep coinduction rules, too.

Abstract. Automatic amortized resource analysis (AARA) is a type-based
technique for inferring concrete (non-asymptotic) bounds on a program's
resource usage. Existing work on AARA has focused on bounds
that are polynomial in the sizes of the inputs. This paper presents an
extension of AARA to exponential bounds that preserves the benefits
of the technique, such as compositionality and efficient type inference 
based on linear constraint solving. A key idea is the use of the Stirling 
numbers of the second kind as the basis of potential functions, which 
play the same role as the binomial coefficients in polynomial AARA. To 
formalize the similarities with the existing analyses, the paper presents 
a general methodology for AARA that is instantiated to the polynomial 
version, the exponential version, and a combined system with potential 
functions that are formed by products of Stirling numbers and binomial 
coefficients. The soundness of exponential AARA is proved with respect 
to an operational cost semantics and the analysis of representative ex- 
ample programs demonstrates the effectiveness of the new analysis. 


Keywords: Functional programming · Resource consumption · Quantitative
analysis · Amortized analysis · Stirling numbers · Exponential


1 Introduction 


“Time is money” is a phrase that also applies to executing software, most directly 
in domains such as on-demand cloud computing and smart contracts where
execution comes with an explicit price tag. In such domains, there is an increasing
interest in formally analyzing and certifying the precise resource usage of pro- 
grams. However, the cost of formally verifying properties by hand is an obstacle 
to even getting projects off the ground. For this reason, it would be desirable if 
such resource analyses could be performed mostly automatically, with reduced 
burden on the programmer.

Techniques and tools for automatic and semi-automatic resource analysis
have been extensively studied. The applied methods range from deriving and
analyzing recurrence relations [55, 1, 16, 2, 12, 36, 10, 37], to abstract interpretation
and static analysis [18, 7, 49, 39], to type systems [11, 56, 53], to proof assistants
and program logics [4, 9, 8, 48, 19, 45, 42], to term rewriting [6, 5, 47]. Many
techniques focus on upper bounds on worst-case resource usage, but average-case
bounds [15, 35, 43, 54] and lower bounds [3, 17, 44] have also been studied.

In this paper, we extend automatic amortized resource analysis (AARA) 
to cover exponential worst-case bounds. AARA is an effective type-based tech- 
nique for deriving concrete (non-asymptotic) worst-case bounds, in particular for 
functional languages. It has been introduced by Hofmann and Jost [31] to derive 
linear bounds on the heap-space usage of strict first-order functional programs 
with lists. Subsequently, AARA has been extended to programs with recursive 
types and general resource metrics [34], higher-order functions [33], lazy evalua- 
tion [52], parallel evaluation [29], univariate polynomial bounds [27], multivariate 
polynomial bounds [23, 25], session-typed concurrency [13], and side effects [38, 
46]. However, none of the aforementioned works explores exponential bounds. 

The idea of AARA is to enrich types with numeric annotations that repre- 
sent coefficients in a potential function in the sense of amortized analysis [51]. 
Bound inference is reduced to Hindley-Milner type inference extended with lin- 
ear constraints for the numeric annotations. Advantages of the technique include 
compositionality, efficient bound inference via off-the-shelf LP solving, and the 
ability to derive bounds on the high-water mark for non-monotone resources 
like memory. A powerful innovation leveraged in polynomial AARA is the repre- 
sentation of potential functions as non-negative linear combinations of binomial 
coefficients. Their combinatorial identities yield simple and local typing rules and 
support a natural semantic understanding of types and bounds. Moreover, these 
potential functions are more expressive than non-negative linear-combinations 
of the standard polynomial basis. 

However, polynomial potential is not always enough. Functional languages 
make it particularly easy to use exponentially many resources just by having two 
or more recursive calls. The following function subsetSum : int list → int →
bool exemplifies this by naively solving the well-known NP-complete problem
subset sum. In the worst case, it performs $3 \cdot 2^{|nums|} - 2$ Boolean and arithmetic
operations (where |x| gives the length of the list x).


    let rec subsetSum nums target =
      match nums with
      | [] -> target = 0
      | hd::tl -> subsetSum tl (target - hd) || subsetSum tl target


Such a function could appear in a program with polynomial resource usage if 
applied to arguments of logarithmic size. In this case, polynomial AARA would 
not be able to derive a bound. Section 6 contains a relevant example. 
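The worst-case count of $3 \cdot 2^{n} - 2$ operations for a list of length $n$ can be checked with a small instrumented version of the function. The Python sketch below is an illustration only; the name `subset_sum_ops` and the convention of counting one operation per comparison, subtraction, and disjunction are assumptions chosen to match the recurrence $T(0) = 1$, $T(n+1) = 2T(n) + 2$.

```python
def subset_sum_ops(nums, target):
    """Return (result, ops), where ops counts Boolean and arithmetic operations."""
    if not nums:
        return target == 0, 1                 # one comparison
    hd, tl = nums[0], nums[1:]
    left, left_ops = subset_sum_ops(tl, target - hd)   # one subtraction
    if left:
        # The disjunction short-circuits: the right call is never made.
        return True, left_ops + 2
    right, right_ops = subset_sum_ops(tl, target)
    return right, left_ops + right_ops + 2    # subtraction and disjunction


def worst_case_ops(n):
    # A target that can never be reached forces both recursive calls everywhere.
    _, ops = subset_sum_ops([1] * n, -1)
    return ops
```

For inputs on which no subset reaches the target, every disjunction evaluates both operands, so `worst_case_ops(n)` agrees with $3 \cdot 2^{n} - 2$; inputs with an early match use fewer operations because of short-circuiting.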

To handle such functions, we introduce an extension to AARA that allows
working with potential functions of the form $f(n) = b^n$. This extension exploits
the combinatorial properties of Stirling numbers of the second kind [50] in
much the same way that AARA currently exploits those of binomial coefficients.
Moreover, we allow both multiplicative and additive mixtures of exponential and 
polynomial potential functions. The techniques used in this process could easily 
be applied to other potential functions in the future. 

The paper first details a generalized AARA type system fit for reuse between 
polynomial, exponential, and other potential functions. We then instantiate this 
system with Stirling numbers of the second kind, yielding the first AARA that 
can infer exponential resource bounds. Finally, we pick out the characteristics 
that allow for mixing different families of potential functions and maximizing 
the space they express, and we instantiate the general system with products 
of exponential and polynomial potential functions. To focus on the main con- 
tribution, we develop the system for a simple first-order language with lists in 
which resource usage is defined with explicit tick expressions. However, we are 
confident that the results smoothly generalize to more general resource metrics, 
recursive types, and higher-order functions. As in previous work, we prove the 
soundness of the analysis with respect to a big-step cost semantics that models 
the high-water mark of the resource usage. 


2 Language and Cost Semantics 


Abstract Syntax To begin, we define an abstract binding tree (ABT, see [20]) 
underlying a simple strict first-order functional language. Expressions are in let- 
normal form to simplify the AARA typing rules. For code examples, however, 
we overlay the ABT with corresponding ML-based syntax. For example, 1::[],
[1], and cons(1, nil) all represent the same list.

A program prog is a collection of functions as defined in the following gram- 
mar. The symbols lit, binop, and unop refer to standard literal values, binary 
operations, and unary operations respectively, of basic types (int, bool, etc.). 
The symbols f, x, and r refer to function identifiers, variables, and rational 
numbers, respectively. 


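$$\begin{aligned}
\mathit{prog} ::=\ & \mathsf{func}\{f\}(x.e)\ \mathit{prog} \mid \epsilon \\
e ::=\ & \mathit{lit} \mid x \mid \mathsf{binop}(x_1; x_2) \mid \mathsf{unop}(x) \mid \mathsf{app}\{f\}(x) \mid \mathsf{let}(e_1; x.e_2) \\
\mid\ & \mathsf{share}(x_1; x_2, x_3.e) \mid \mathsf{tick}\{r\} \mid \mathsf{pair}(x_1; x_2) \mid \mathsf{nil} \mid \mathsf{cons}(x_1; x_2) \\
\mid\ & \mathsf{cond}(x; e_1; e_2) \mid \mathsf{pairMatch}(x_1; x_2, x_3.e) \mid \mathsf{listMatch}(x_1; e_1; x_2, x_3.e_2)
\end{aligned}$$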


Expressions include function applications, conditionals, and the usual introduc- 
tion and elimination forms for pairs and lists. They also include two special 
expressions: tick{r} and share. The former, tick{r}, is used to specify constant 
resource cost r. We allow r to be negative in the case of resources becoming 
available instead of being consumed. The latter, $\mathsf{share}(x_1; x_2, x_3.e)$, provides two
copies of its argument $x_1$ for use in $e$. This is useful because the affine features
of the AARA type system do not allow naive variable reuse. In practice, share 
can be left implicit by automatically preceding every variable usage by share. 
To focus on the technical novelties, we keep function identifiers and variables 
disjoint, that is, the types of variables do not contain arrow types and functions 
are first-order. Higher-order functions can be handled as in previous AARA
literature [25]. As a further simplification, we only let functions take one argument;
multiple arguments can be simulated with nested pairs. Finally, the language
here only supports the inductive types of lists; future work could extend this to 
more general types as in other AARA literature [38, 25, 30, 28]. 


Operational Cost Semantics To define resource usage, AARA literature uses
the operational big-step judgment $V \vdash e \Downarrow v \mid (q, q')$ (see e.g. [22]) defined in
Figure 1. This judgment means that, under the environment $V$, the expression
$e$ evaluates to the value $v$ under some resource constraints given by the pair
$(q, q')$. The environment $V$ maps variables to values. The resource constraints are
that q is the high-water mark of resource usage, and q — q’ is the net amount 
of resources consumed during evaluation. In other words, if one started with 
exactly as many resources needed to evaluate e, that amount would be q, and 
the amount of leftover resources after evaluation would be q’. It is essential 
to track both of these values to model resources that might be returned after 
use, like space. Space usage usually has a positive high-water mark but no net 
resource consumption, as space could be reused. 
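How these pairs compose can be made concrete with the arithmetic used by the Tick and Let rules of Figure 1. The Python sketch below is an illustration under assumptions: the helper names `tick` and `seq` are hypothetical, and `seq` only models the resource bookkeeping of evaluating one expression after another, not values or environments.

```python
def tick(r):
    """Cost pair of tick{r}: (resources needed up front, resources left over)."""
    return (max(r, 0), max(-r, 0))


def seq(first, second):
    """Combine cost pairs as in the Let rule.

    first  = (q, q_rem): needs q up front, leaves q_rem afterwards
    second = (p, p_rem): needs p up front, leaves p_rem afterwards
    """
    q, q_rem = first
    p, p_rem = second
    # Any part of p not covered by the leftover q_rem must be available initially;
    # any leftover q_rem not consumed by p survives to the end.
    return (q + max(p - q_rem, 0), p_rem + max(q_rem - p, 0))


# Allocating 3 units and then freeing them: high-water mark 3, net consumption 0.
alloc_then_free = seq(tick(3), tick(-3))
```

In this model `alloc_then_free` is `(3, 3)`: three units are needed at the peak and all three are available again afterwards, which matches the description of space usage as having a positive high-water mark but no net consumption.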

The above big-step judgment only formalizes terminating evaluations. To
deal with divergence, the additional judgment $V \vdash e \Downarrow \circ \mid q$ has been
introduced [26]. This merely drops the parts of the previous judgment relevant to
post-termination, focusing on partial evaluation. It means that some partial
evaluation of $e$ uses a high-water mark of $q$ resources. Should it exist, the largest
$q$ such that $V \vdash e \Downarrow \circ \mid q$ holds would be the high-water mark of resource usage
across any partial evaluation of $e$. For a formal definition, see [26].


3 Automatic Amortized Resource Analysis 


Here we lay out a generalized version of the AARA system with the poten- 
tial functions abstracted. Existing AARA literature is specialized to polynomial 
functions (see e.g. [27]). This existing polynomial system may be obtained as an 
instantiation, as may the exponential system that we introduce in Section 4. 

AARA uses the potential (or physicist’s) method to account for resource 
use, as is commonly used in amortized analyses. The potential method uses the 
physical analogy of converting between potential and actual energy that can be 
used to perform work. Whereas a physicist might find potential in the chemical 
bonds of a fuel, however, AARA places it in the constructors of lists. 

To prime intuition with an example, consider paying a resource for each :: 
operation performed in the following code. It performs snoc, which is like cons 
but adds onto the back of the list rather than the front. 


    let rec snoc x xs =
      match xs with
      | [] -> tick 1; x::[]                 (* pay 1 resource here *)
      | hd::tl -> tick 1; hd::(snoc x tl)   (* pay 1 resource here *)


The resource consumption of snoc x xs as defined by the tick expressions
is 1 + |xs|. Using the potential method, we can justify this bound as follows.
If 1 resource is initially available, then the base case of the empty list can be 
paid for. If there is 1 stored per element of the list then 1 resource is released 
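in the cons case of the pattern match. This suffices to pay for the additional ::

Fig. 1. Terminating operational cost semantics rules.

$$\frac{q = \max(r, 0) \qquad q' = \max(-r, 0)}{V \vdash \mathsf{tick}\{r\} \Downarrow () \mid (q, q')}\ \text{Tick}
\qquad
\frac{\mathit{binop}(V(x_1), V(x_2)) \Rightarrow v}{V \vdash \mathsf{binop}(x_1, x_2) \Downarrow v \mid (0, 0)}\ \text{Binop}$$

$$\frac{}{V \vdash \mathit{lit} \Downarrow \mathit{lit} \mid (0, 0)}\ \text{Lit}
\qquad
\frac{V(x) = v}{V \vdash x \Downarrow v \mid (0, 0)}\ \text{Var}
\qquad
\frac{V(x_1) = v_1 \qquad V(x_2) = v_2}{V \vdash \mathsf{pair}(x_1, x_2) \Downarrow (v_1, v_2) \mid (0, 0)}\ \text{Pair}$$

$$\frac{\mathit{unop}(V(x)) \Rightarrow v}{V \vdash \mathsf{unop}(x) \Downarrow v \mid (0, 0)}\ \text{Unop}
\qquad
\frac{V(x_p) = (v_1, v_2) \qquad V[x_1 \mapsto v_1, x_2 \mapsto v_2] \vdash e \Downarrow v \mid (q, q')}{V \vdash \mathsf{pairMatch}(x_p; x_1, x_2.e) \Downarrow v \mid (q, q')}\ \text{PMat}$$

$$\frac{V \vdash e_1 \Downarrow v_1 \mid (q, q') \qquad V[x \mapsto v_1] \vdash e_2 \Downarrow v_2 \mid (p, p')}{V \vdash \mathsf{let}(e_1; x.e_2) \Downarrow v_2 \mid (q + \max(p - q', 0),\ p' + \max(q' - p, 0))}\ \text{Let}$$

$$\frac{V(x_b) = \mathsf{true} \qquad V \vdash e_t \Downarrow v \mid (q, q')}{V \vdash \mathsf{cond}(x_b; e_t; e_f) \Downarrow v \mid (q, q')}\ \text{CondT}
\qquad
\frac{V(x_b) = \mathsf{false} \qquad V \vdash e_f \Downarrow v \mid (q, q')}{V \vdash \mathsf{cond}(x_b; e_t; e_f) \Downarrow v \mid (q, q')}\ \text{CondF}$$

$$\frac{\mathsf{func}\{f\}(x'.e) \in \mathit{prog} \qquad V(x) = v_x \qquad V[x' \mapsto v_x] \vdash e \Downarrow v \mid (q, q')}{V \vdash \mathsf{app}\{f\}(x) \Downarrow v \mid (q, q')}\ \text{App}$$

$$\frac{V(x) = \mathsf{nil} \qquad V \vdash e_1 \Downarrow v \mid (q, q')}{V \vdash \mathsf{listMatch}(x; e_1; x_h, x_t.e_2) \Downarrow v \mid (q, q')}\ \text{LMat0}
\qquad
\frac{V(x_h) = v_h \qquad V(x_t) = v_t}{V \vdash \mathsf{cons}(x_h; x_t) \Downarrow v_h :: v_t \mid (0, 0)}\ \text{Cons}$$

$$\frac{V(x) = v_h :: v_t \qquad V[x_h \mapsto v_h, x_t \mapsto v_t] \vdash e_2 \Downarrow v \mid (q, q')}{V \vdash \mathsf{listMatch}(x; e_1; x_h, x_t.e_2) \Downarrow v \mid (q, q')}\ \text{LMat1}
\qquad
\frac{}{V \vdash \mathsf{nil} \Downarrow \mathsf{nil} \mid (0, 0)}\ \text{Nil}$$

$$\frac{V[x_2 \mapsto V(x_1), x_3 \mapsto V(x_1)] \vdash e \Downarrow v \mid (q, q')}{V \vdash \mathsf{share}(x_1; x_2, x_3.e) \Downarrow v \mid (q, q')}\ \text{Share}$$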

operation. The remaining potential on xs can be assigned to tl for the recursive
call. One can sum these costs to infer that the initial potential 1 + |xs| covers
the cost of all the :: operations. The AARA type system could describe this with
the typing $L^1(\mathbb{Z})$ for xs (describing the linear potential in the superscript) and
$\mathbb{Z} \times L^1(\mathbb{Z}) \xrightarrow{1/0} L^0(\mathbb{Z})$ for snoc (describing the initial/remaining resources above
the arrow). Another valid type is $\mathbb{Z} \times L^2(\mathbb{Z}) \xrightarrow{2/0} L^1(\mathbb{Z})$, which could be used in a
context where the result of snoc must be used to pay for additional cost.


Types The AARA system laid out here supports the types given below. The 
symbol F gives the types of functions, where q and q’ are non-negative rationals. 
The symbol S gives the remaining non-function types, where basic stands for 
the basic types like int or unit, and the resource annotation P is an indexed 
family of rationals representing the coefficients in a linear combination of basic 
potential functions. 


$$F ::= S \xrightarrow{q/q'} S \qquad\qquad S ::= \mathit{basic} \mid L^P(S) \mid S \times S$$


The typing rules for these types are given in Figure 2 and explained in the 
following sections. The values of these types are the usual values.

Potential To understand typing rules, it is necessary to define potential. The
following potential constructs are generalized from polynomial AARA work [27].
As mentioned, $P = (p_i)_{i \in I}$ is in $\mathbb{Q}^I$ as an indexed family of rationals. Each
entry represents a coefficient in a linear combination of basic potential functions.
This linearity makes it natural to overload the type of $P$ as a vector or matrix of
rationals, so it is treated as such whenever the context is appropriate. Finally, let
those basic potential functions be fixed as some family $(f_i)_{i \in I}$, where $f_i(0) = 0$.
We define the potential represented with $P$ using the function $\phi$ where

$$\phi(n, P) = \sum_{i \in I} p_i \cdot f_i(n).$$

The function $\phi$ yields the total potential on a list (excluding the potential of its
elements) as a function of the list's size $n$ and its potential annotation $P$.
We can then relate resource potential between different sizes of list with the
shift operator $\lhd : \mathbb{Q}^I \to \mathbb{Q}^I$ and constant difference operator $\delta : \mathbb{Q}^I \to \mathbb{Q}$. These
functions need only satisfy the following equation.


$$\phi(n + 1, P) = \delta(P) + \phi(n, \lhd P) \qquad (1)$$


Though we leave open the explicit definition of these functions for generality,
we later work only with instances of them that are linear operators, so that
Equation 1 denotes a linear recurrence. Such a refinement leaves $\lhd P$ and $\delta(P)$
linear functions of $P$.
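As one concrete instance of Equation 1, consider the binomial basis $f_i(n) = \binom{n}{i}$ for $i = 1, \ldots, k$ that the introduction attributes to polynomial AARA. Pascal's identity $\binom{n+1}{i} = \binom{n}{i} + \binom{n}{i-1}$ suggests taking $\delta(P) = p_1$ and $\lhd P = (p_1 + p_2, \ldots, p_{k-1} + p_k, p_k)$. The Python sketch below checks this numerically; the function names and the list representation of $P$ are assumptions made for illustration, not definitions from this paper.

```python
from math import comb


def phi(n, P):
    """Potential of a list of length n, with P[i-1] the coefficient of C(n, i)."""
    return sum(p * comb(n, i) for i, p in enumerate(P, start=1))


def delta(P):
    """Constant difference: the potential released by one cons cell."""
    return P[0]


def shift(P):
    """Shift operator: annotation for the tail of the list."""
    return [P[i] + P[i + 1] for i in range(len(P) - 1)] + [P[-1]]


def equation_1_holds(n, P):
    return phi(n + 1, P) == delta(P) + phi(n, shift(P))
```

Both operators are linear in $P$, so this instance falls within the linear-operator setting described above, and `phi(0, P)` is 0 as required by $f_i(0) = 0$.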

These functions come in handy for understanding the stored potential in a
value of a given type, defined by the potential function $\Phi$ as follows.

$$\begin{aligned}
\Phi(v : \mathit{basic}) &= 0 \\
\Phi((v_1, v_2) : A_1 \times A_2) &= \Phi(v_1 : A_1) + \Phi(v_2 : A_2) \\
\Phi([] : L^P(A)) &= 0 \\
\Phi(h :: t : L^P(A)) &= \delta(P) + \Phi(h : A) + \Phi(t : L^{\lhd P}(A))
\end{aligned}$$

We often need to measure the potential across an entire evaluation context 
of typed values V : I’ given by a typing context I" and variable bindings V. We 
do so by extending the definition of potential @ as follows. 


&(0) =0 (V :(I,v: A)) =O(V : I) 4+ (uv: A) 

Finally, we can use these definitions to obtain a closed-form expression for
the potential over an entire list (including its elements) with the following:

Lemma 1. Let $l = [a_n, \ldots, a_1]$ be a list of $n$ values. Then
$\Phi(l : L^{P}(A)) = \phi(n, P) + \sum_{i=1}^{n} \Phi(a_i : A)$.

Proof. We induct over the structure of the list $l$.

For the empty list of length 0:

$$\Phi([\,] : L^{P}(A)) = 0 = \sum_{i} p_i \cdot f_i(0) = \phi(0, P) + \sum_{i=1}^{0} \Phi(a_i : A)$$

For $l = h :: t$ of size $n + 1$:

$$\begin{aligned}
\Phi(a_{n+1} :: t : L^{P}(A)) &= \delta(P) + \Phi(a_{n+1} : A) + \Phi(t : L^{\lhd P}(A)) \\
&= \delta(P) + \Phi(a_{n+1} : A) + \phi(n, \lhd P) + \sum_{i=1}^{n} \Phi(a_i : A) \\
&= \phi(n + 1, P) + \sum_{i=1}^{n+1} \Phi(a_i : A)
\end{aligned}$$
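As a sanity check on Lemma 1, the following minimal sketch evaluates $\Phi$ by its recursive definition and compares it with the closed form $\phi(n, P)$. It assumes the binomial instance of the basis known from polynomial AARA (coefficient $p_i$ on $\binom{n}{i}$, shift $p_i + p_{i+1}$, constant difference $p_1$) and list elements of basic type, which carry no potential. All function names are illustrative and not part of the formal system.

```python
from math import comb

def phi(n, P):
    """Closed form: P[i-1] is the coefficient of binomial(n, i), i = 1..len(P)."""
    return sum(p * comb(n, i) for i, p in enumerate(P, start=1))

def shift(P):
    """Shift operator: entry i becomes p_i + p_{i+1}; entries past the prefix are 0."""
    return [p + (P[j + 1] if j + 1 < len(P) else 0) for j, p in enumerate(P)]

def delta(P):
    """Constant difference: the coefficient of binomial(n, 1)."""
    return P[0]

def Phi(values, P):
    """Recursive potential of a list whose elements are basic (zero potential)."""
    if not values:
        return 0
    head_potential = 0  # assumption: elements are of basic type
    return delta(P) + head_potential + Phi(values[1:], shift(P))

# Lemma 1 specialised to basic elements: Phi(l : L^P) == phi(|l|, P).
P = [3, 1, 2]
for n in range(8):
    assert Phi(list(range(n)), P) == phi(n, P)
```

The loop passes only because `shift` and `delta` satisfy Equation 1 for this basis; replacing either with an incompatible definition makes the assertion fail.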
We can apply Lemma 1 to the previously defined function snoc to see the
change in potential between input and output. This difference in potential should
bound the resources consumed. For this case, the basic potential functions $(f_i)$
need only contain $\lambda n. n$, and we can let $\lhd(p) = p = \delta((p))$. Letting $y$ be the
result of snoc x xs, the type $\mathbb{Z} \times L^{1}(\mathbb{Z}) \xrightarrow{1/0} L^{0}(\mathbb{Z})$ indicates the following bound.

$$\Phi(x : \mathbb{Z}, xs : L^{1}(\mathbb{Z})) + 1 - \Phi(y : L^{0}(\mathbb{Z})) = \phi(|xs|, 1) + 1 - \phi(|y|, 0) = |xs| + 1$$

This is exactly the amount of resources consumed, so the bound is tight.

In this work we only consider so-called univariate potential, wherein every
term in the potential sum depends on the length of only one input list. However,
different univariate potential summands may depend on different inputs,
and thus univariate potential may still be multivariate. The term multivariate
potential refers to using more general multivariate functions for potential. There
is existing work on multivariate potential using polynomial functions [24]. We
expect that the work here extends to multivariate potential similarly.
Typing Rules. The typing rules in Figure 2 use the judgment $\Sigma; \Gamma \vdash^{q}_{q'} e : A$. In
this typing judgment, $\Gamma$ maps variables to types, while $\Sigma$ maps function labels
to sets of types. This judgment holds when, in the typing environment given by
$\Sigma$ and $\Gamma$, the expression $e$ is of type $A$, subject to the constraints that $q$ and
$q'$ are the amounts of available resources before and after some evaluation of $e$.
Unlike the judgment $V \vdash e \Downarrow v \mid (q, q')$, these values need not be tight.

By expressing available resources on the turnstile, and potential resources
in the types given by $\Sigma$, $\Gamma$, and $A$, the type system is set up to formalize the
reasoning of the potential method. Theorem 1 shows that it is sound with respect
to the operational semantics of Section 2.

Many typing rules preserve the total resource potential they are given, consuming
none of it themselves. They therefore usually either have no explicit
interaction with potential (e.g. Lit) or pass around exactly what they are given
(e.g. Let). All basic rules in the first block of Figure 2 fit this characterization.

The typing rules concerning functions in the second block of Figure 2 are the
only ones that make use of $\Sigma$. For each function $f$ defined in prog via func{f}(x.e),
$\Sigma(f)$ refers to the set of types that its body $e$ could be given. That we allow for
sets of types is important because recursive calls to a function may not always
make use of a type with the same resource annotations; this is called
resource-polymorphic recursion. Although these rules capture the intuition behind typing
resource-polymorphic recursion, they are not used in existing implementations,
as they lead to infinite type derivations. Nonetheless there exists an effective way
to type resource-polymorphic recursion with a finite derivation; see [26]. In the
examples provided in this article, it usually suffices to consider only
resource-monomorphic recursion, wherein inner and outer calls use the same annotation.

All of the rules discussed so far are simply those of existing AARA literature
with their parameter for operation cost set to 0 (see e.g. [27]). This does not
change their generality, as such constant cost can (and could already in prior
work) be simulated using tick. Similarly, non-constant costs could be simulated
by running helper functions using tick the appropriate number of times.
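Fig. 2. AARA typing rules.

Basic rules:

$$\frac{}{\Sigma; \emptyset \vdash^{0}_{0} \mathit{lit} : \mathsf{basic}}\ \textsc{Lit} \qquad \frac{\Sigma; \Gamma_1 \vdash^{q}_{r} e_1 : A \quad \Sigma; \Gamma_2, x : A \vdash^{r}_{q'} e_2 : B}{\Sigma; \Gamma_1, \Gamma_2 \vdash^{q}_{q'} \mathsf{let}(e_1; x.e_2) : B}\ \textsc{Let}$$

$$\frac{}{\Sigma; x : \mathsf{basic} \vdash^{0}_{0} \mathsf{unop}(x) : \mathsf{basic}'}\ \textsc{Unop} \qquad \frac{}{\Sigma; x_1 : \mathsf{basic}, x_2 : \mathsf{basic} \vdash^{0}_{0} \mathsf{binop}(x_1, x_2) : \mathsf{basic}'}\ \textsc{Binop}$$

$$\frac{}{\Sigma; x : A \vdash^{0}_{0} x : A}\ \textsc{Var} \qquad \frac{}{\Sigma; x_1 : A_1, x_2 : A_2 \vdash^{0}_{0} \mathsf{pair}(x_1, x_2) : A_1 \times A_2}\ \textsc{Pair}$$

$$\frac{\Sigma; \Gamma, x_1 : A_1, x_2 : A_2 \vdash^{q}_{q'} e : B}{\Sigma; \Gamma, x : A_1 \times A_2 \vdash^{q}_{q'} \mathsf{pairMatch}(x; x_1, x_2.e) : B}\ \textsc{PMat}$$

$$\frac{\Sigma; \Gamma, x : \mathsf{bool} \vdash^{q}_{q'} e_1 : A \quad \Sigma; \Gamma, x : \mathsf{bool} \vdash^{q}_{q'} e_2 : A}{\Sigma; \Gamma, x : \mathsf{bool} \vdash^{q}_{q'} \mathsf{cond}(x; e_1; e_2) : A}\ \textsc{Cond}$$

Function rules:

$$\frac{\mathsf{func}\{f\}(x.e) \in \mathit{prog} \quad \Sigma; x : A \vdash^{q}_{q'} e : B}{A \xrightarrow{q/q'} B \in \Sigma(f)}\ \textsc{Fun} \qquad \frac{A \xrightarrow{q/q'} B \in \Sigma(f)}{\Sigma; x : A \vdash^{q}_{q'} \mathsf{app}\{f\}(x) : B}\ \textsc{App}$$

Potential-focused rules:

$$\frac{}{\Sigma; \emptyset \vdash^{\max(r, 0)}_{\max(-r, 0)} \mathsf{tick}\{r\} : \mathsf{unit}}\ \textsc{Tick} \qquad \frac{\Sigma; \Gamma \vdash^{p}_{p'} e : A \quad q \ge p \quad q - p \ge q' - p'}{\Sigma; \Gamma \vdash^{q}_{q'} e : A}\ \textsc{Relax}$$

$$\frac{\Sigma; \Gamma, x : A \vdash^{q}_{q'} e : B \quad A' <: A}{\Sigma; \Gamma, x : A' \vdash^{q}_{q'} e : B}\ \textsc{SubWeakL} \qquad \frac{\Sigma; \Gamma \vdash^{q}_{q'} e : A' \quad A' <: A}{\Sigma; \Gamma \vdash^{q}_{q'} e : A}\ \textsc{SubWeakR}$$

$$\frac{\Sigma; \Gamma, x_2 : A_2, x_3 : A_3 \vdash^{q}_{q'} e : B \quad A_1 \curlyvee (A_2, A_3)}{\Sigma; \Gamma, x_1 : A_1 \vdash^{q}_{q'} \mathsf{share}(x_1; x_2, x_3.e) : B}\ \textsc{Sharing}$$

List rules:

$$\frac{}{\Sigma; \emptyset \vdash^{0}_{0} \mathsf{nil} : L^{P}(A)}\ \textsc{Nil} \qquad \frac{}{\Sigma; x_h : A, x_t : L^{\lhd P}(A) \vdash^{\delta(P)}_{0} \mathsf{cons}(x_h; x_t) : L^{P}(A)}\ \textsc{Cons}$$

$$\frac{\Sigma; \Gamma \vdash^{q}_{q'} e_1 : B \quad \Sigma; \Gamma, x_h : A, x_t : L^{\lhd P}(A) \vdash^{q + \delta(P)}_{q'} e_2 : B}{\Sigma; \Gamma, x : L^{P}(A) \vdash^{q}_{q'} \mathsf{listMatch}(x; e_1; x_h, x_t.e_2) : B}\ \textsc{ListMatch}$$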
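Fig. 3. AARA subtyping and sharing judgments.

$$\frac{\forall i.\ p_i \ge q_i}{L^{P}(A) <: L^{Q}(A)}\ \textsc{Subtype} \qquad \frac{}{\mathsf{basic} \curlyvee (\mathsf{basic}, \mathsf{basic})}\ \textsc{ShareBasic}$$

$$\frac{A_1 \curlyvee (A_2, A_3) \quad B_1 \curlyvee (B_2, B_3)}{A_1 \times B_1 \curlyvee (A_2 \times B_2, A_3 \times B_3)}\ \textsc{SharePair} \qquad \frac{A_1 \curlyvee (A_2, A_3) \quad P = Q + R}{L^{P}(A_1) \curlyvee (L^{Q}(A_2), L^{R}(A_3))}\ \textsc{ShareList}$$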


The remaining rules cover sharing, subtype-weakening, and the rules
concerning lists. Weakening, though not listed, is also allowed.

Sharing is a form of contraction. By sharing, the rest of the typing rules
can become affine, allowing only single usages of a given variable. Intuitively,
sharing is meant to prevent duplicating potential across multiple usages of a
variable, and instead split the potential across them. The rules for the sharing
judgment, indicating how to split potential, can be found in Figure 3. Note that
the rule ShareList adds indexed collections of rationals; this should be interpreted
pointwise, as if the addends were vectors or matrices.

Subtype-weakening is a form of subtyping based on potential. It discards
potential on a list, weakening the upper bound on resources it represents. This
rule follows all usual subtyping rules, as well as Subtype from Figure 3. Relaxing
behaves similarly, but loosens the bounds on the available resources instead.

The intuition for the rules concerning lists in the last block of Figure 2 is that
total resources should be conserved between constructions and destructions.
Because $\delta(P)$ expresses the difference in potential, it is exactly how many resource
units are released after a pattern match on a list of type $L^{P}(A)$. For the same
reason, it is also how many need to be stored when reversing the process and
putting an element on a list of type $L^{\lhd P}(A)$. Finally, when a list is empty, it
has no room to store potential. Every potential function $f_i$ maps 0 to 0, so an
empty list can safely be assigned any scalar of zero potential.


Soundness. The soundness of the type system is expressed with the following
theorem. It states that the evaluation of an expression $e$ does not require more
resources than initially present, and (should evaluation terminate) it leaves at
least as many resources as dictated. The proof is a straightforward generalization
of the version from [27], but we nonetheless reproduce the proof below.


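Theorem 1. Let $\Sigma; \Gamma \vdash^{q}_{q'} e : B$ and let $V$ provide the variable bindings for $\Gamma$.

1. If $V \vdash e \Downarrow v \mid (p, p')$ then $p \le \Phi(V : \Gamma) + q$ and $p - p' \le \Phi(V : \Gamma) + q - \Phi(v : B) - q'$.

2. If $V \vdash e \Downarrow \circ \mid p$ then $p \le \Phi(V : \Gamma) + q$.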


Proof. Assume $V$ binds $\Gamma$'s variables and perform nested induction on the type
derivation and operational judgment for an expression in let-normal form. We
show the induction below only for the terminating operational judgment cases,
but the partial-evaluation cases are nearly identical.

(Base Non-Cons) Suppose the last rule applied in the typing derivation is
any non-Cons base case, i.e., Lit, Var, Unop, Binop, Pair, Nil, or Tick. Then
assume the appropriate terminating operational judgment rule applies. In such
a case, one finds $p \le q$, $p' \ge q'$, and $\Phi(v : B) = \Phi(V : \Gamma)$. This and the
non-negativity of potential are sufficient to satisfy the desired inequalities.

(Base Cons) Suppose the last rule is Cons, so $q = \delta(P)$ and $q' = 0$. Assume
the Cons operational judgment applies, so that $p = p' = 0$. Note $\Phi(v_h :: v_t : L^{P}(A))$
is equal to $\delta(P) + \Phi(v_h : A) + \Phi(v_t : L^{\lhd P}(A))$ by definition. This identity and the
non-negativity of potential satisfy the desired inequalities.

(Step Implicit Inequalities) Suppose the last rule is one of SubWeakL,
SubWeakR, Relax, or substructural weakening, and assume some operational
judgment applies. Each typing requires a similar typing judgment as a premiss.
Further, none changes any values, so the same operational judgment still applies.
Thus, the inductive hypothesis applies, and gives almost the inequalities we need.
Each case provides the inequalities needed to finish. For subtype-weakening, it
suffices to note that $C <: D$ entails $\Phi(v : C) \ge \Phi(v : D)$, since $C$ is pointwise
greater-than-or-equal to $D$. For Relax, the premisses of the rule directly
include the inequalities needed to complete the case. And we can complete the
substructural weakening case by noting that the non-negativity of potential
entails $\Phi(V : \Gamma, v : A) \ge \Phi(V : \Gamma)$.

(Step Let) Suppose the last rule is Let, and suppose its operational judgment
applies. The premisses of the typing rule require that $\Sigma; \Gamma_1 \vdash^{q}_{r} e_1 : A$ and
$\Sigma; \Gamma_2, x : A \vdash^{r}_{q'} e_2 : B$ for some intermediate annotation $r$. The premisses of the
operational judgment require that $V \vdash e_1 \Downarrow v_1 \mid (s, s')$ and
$V[x \mapsto v_1] \vdash e_2 \Downarrow v_2 \mid (t, t')$, where $p = s + \max(t - s', 0)$
and $p' = t' + \max(s' - t, 0)$. Applying the inductive hypothesis to these premiss
pairs and adding the resulting inequalities cancels terms to complete the case.

(Step Sharing) Suppose the last rule is Sharing, so that $\Gamma = \Gamma', x_1 : A_1$. It
requires as a premiss that $\Sigma; \Gamma', x_2 : A_2, x_3 : A_3 \vdash^{q}_{q'} e : B$, where $A_1 \curlyvee (A_2, A_3)$.
Assuming the operational judgment Share applies, $V[x_2 \mapsto V(x_1), x_3 \mapsto V(x_1)] \vdash e \Downarrow v \mid (p, p')$
also holds. The inductive hypothesis applies, yielding the needed
inequalities, but for $x_2, x_3$ instead of $x_1$. However, the sharing relation ensures
that $\Phi(v_1 : A_1) = \Phi(v_2 : A_2, v_3 : A_3)$, and this identity finishes the case.

(Step ListMatch) Suppose the last rule is ListMatch, so $\Gamma = \Gamma', x : L^{P}(A)$.
There are two operational judgments which could apply: LMat0 and LMat1.

Suppose the former judgment applies. It requires that $V \vdash e_1 \Downarrow v \mid (p, p')$. At
the same time, the ListMatch rule requires as a premiss that $\Sigma; \Gamma' \vdash^{q}_{q'} e_1 : B$.
The inductive hypothesis applies, yielding the needed inequalities, but for $\Gamma'$
instead of $\Gamma$. However, because $\Phi(\mathsf{nil} : L^{P}(A)) = 0$, we see $\Phi(V : \Gamma') = \Phi(V : \Gamma)$,
and the desired inequalities result.

Suppose instead the latter judgment applies. This judgment requires as a
premiss that $V[x_h \mapsto v_h, x_t \mapsto v_t] \vdash e_2 \Downarrow v \mid (p, p')$. At the same time, the
ListMatch rule requires that $\Sigma; \Gamma', x_h : A, x_t : L^{\lhd P}(A) \vdash^{q + \delta(P)}_{q'} e_2 : B$. The
inductive hypothesis applies, telling us that
$p - p' \le \Phi(V : \Gamma', v_h : A, v_t : L^{\lhd P}(A)) + q + \delta(P) - \Phi(v : B) - q'$
and $p \le \Phi(V : \Gamma', v_h : A, v_t : L^{\lhd P}(A)) + q + \delta(P)$.
By definition, $\Phi(v_h :: v_t : L^{P}(A)) = \delta(P) + \Phi(v_h : A) + \Phi(v_t : L^{\lhd P}(A))$, and
applying this identity to the inequalities yields the inequalities needed.
(Step Cond) Suppose the last rule is Cond, and that either the CondT
or the CondF operational judgment applies. In either case, applying the inductive
hypothesis to its premiss and the premiss of Cond gives the needed inequalities.

(Step PMat) Suppose that the last rule applied is PMat, so that $\Gamma = \Gamma', x : A_1 \times A_2$.
This rule would require as a premiss that $\Sigma; \Gamma', x_1 : A_1, x_2 : A_2 \vdash^{q}_{q'} e' : B$,
for $e'$ the body of the match statement $e$. Suppose the PMat operational
judgment applies. This judgment requires as a premiss that
$V[x_1 \mapsto v_1, x_2 \mapsto v_2] \vdash e' \Downarrow v \mid (p, p')$, where the value of $x$ is $(v_1, v_2)$. Applying the inductive
hypothesis to these premisses followed by the definitional identity
$\Phi((v_1, v_2) : A_1 \times A_2) = \Phi(v_1 : A_1) + \Phi(v_2 : A_2)$ completes the case.

(Step App) Suppose the last rule is App. Note that this rule requires Fun
as a premiss, which in turn requires $\Sigma; x' : A \vdash^{q}_{q'} e' : B$ where $e'$ is the body of
the function being applied and $x'$ its parameter. If the App operational judgment applies, its premiss
would require $V[x' \mapsto V(x)] \vdash e' \Downarrow v \mid (p, p')$. Although $e'$ might not be a smaller
expression than $e$, the operational judgment derivation still shrinks. This means
the inductive hypothesis applies, and it gives the exact inequalities needed.


Type Inference. Type inference for the Hindley-Milner part of the type system
is decidable [21,41]. The only new barrier for automating inference in AARA is
obtaining witnesses for all the coefficients in each annotation $P$ in a derivation.

Each typing rule naturally gives a set of linear constraints on the entries of $P$.
If the relation given by $\lhd$ and $\delta$ can likewise be expressed with linear constraints,
then all such constraints are linear. So long as $|P|$ is finite, this forms a linear
program. A linear program solver can then find minimal witnesses efficiently.

Existing AARA literature (see e.g. [27]), however, uses binomial coefficients
as the basis functions for $P$, of which there are infinitely many. This nonetheless
works because only a particular finite prefix of their set, $\binom{n}{1}, \ldots, \binom{n}{d}$ for a fixed degree $d$, is used
as a basis in a given analysis. Each such prefix basis also yields the same
locally-definable shift operation: the linear equality $\lhd p_i = p_i + p_{i+1}$, where $p_k$ is the
coefficient of $\binom{n}{k}$ and is 0 if the function is outside the prefix. As this is a linear
relation, and each prefix is finite, inference can be performed via linear program.
The prefix bases of binomial coefficients thereby form an infinite family of finite
bases, each of which allows automated inference of resource polynomials up to
a fixed degree in the AARA system.

As a caveat, not all programs use resources in a manner compatible with 
the AARA system. Indeed, it is undecidable whether or not a program uses e.g. 
polynomial amounts of resources, as this could solve the halting problem. 


4 Exponential Potential 


Stirling numbers of the second kind ${n \brace k} = \frac{1}{k!}\sum_{i=0}^{k}(-1)^{i}\binom{k}{i}(k - i)^{n}$ count the
number of ways to form a $k$-partition of a set of $n$ elements. These can be used
to express exponential potential functions similarly to how binomial coefficients
can express polynomial ones. In particular, we make use of Stirling numbers with
arguments $n, k$ offset by 1, ${n+1 \brace k+1}$, so that $\phi(n, P) = \sum_i p_i {n+1 \brace i+1}$. While other
bases could also express exponential potential, these offset Stirling numbers have
a few particularly desirable properties, which are described in this section.
Simple Shift Operation. Like binomial coefficients, the prefixes of the basis of the
offset Stirling numbers of the second kind form an infinite family of finite bases,
each of which allows automated inference in the AARA system. However, these
potential functions are exponential rather than polynomial.

Stirling numbers of the second kind satisfy the recurrence
${n+1 \brace k+1} = (k+1){n \brace k+1} + {n \brace k}$. This recurrence allows the $\lhd$ operation to have the same local
definition for every annotation entry in every prefix basis: $\lhd p_i = (i+1)p_i + p_{i+1}$,
where $p_k$ is the coefficient of ${n+1 \brace k+1}$ and is 0 if the function index is outside
the chosen prefix. Given this definition for $\lhd$ and letting $\delta(P) = p_1$, the coefficient of ${n+1 \brace 2}$, we find
$p_1 + \sum_{i \ge 1} \lhd p_i {n+1 \brace i+1} = \sum_{i \ge 1} p_i {n+2 \brace i+1}$, satisfying Equation 1.

This shift operation yields a linear relation, as the coefficient of a given $p_i$ is
a constant scalar. Thus, exactly as when using binomial coefficients, inference
is automatable via linear programming. Certain other exponential bases, like
Gaussian binomial coefficients, could be similarly automated.
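The shift recurrence for offset Stirling numbers can be checked numerically. The sketch below is illustrative only: it assumes the annotation is stored as a Python list whose first entry is the coefficient of ${n+1 \brace 2} = 2^n - 1$, so that each coefficient of ${n+1 \brace k+1}$ is scaled by $k+1$ and gains its right neighbour, and the constant difference is the first entry. The helper names are hypothetical.

```python
from functools import lru_cache

@lru_cache(maxsize=None)
def stirling2(n, k):
    """Stirling number of the second kind via its standard recurrence."""
    if n == k:
        return 1
    if n == 0 or k == 0:
        return 0
    return k * stirling2(n - 1, k) + stirling2(n - 1, k - 1)

def phi_exp(n, P):
    """P[j] is the coefficient of S(n+1, j+2); P[0] belongs to 2^n - 1."""
    return sum(p * stirling2(n + 1, j + 2) for j, p in enumerate(P))

def shift_exp(P):
    """Scale the coefficient of S(n+1, k+1) by k+1 and add its right neighbour."""
    return [(j + 2) * p + (P[j + 1] if j + 1 < len(P) else 0)
            for j, p in enumerate(P)]

def delta_exp(P):
    """Constant released per element: the coefficient of 2^n - 1."""
    return P[0]

# Equation 1: phi(n + 1, P) == delta(P) + phi(n, shift(P)).
P = [3, 0, 5]
for n in range(10):
    assert phi_exp(n + 1, P) == delta_exp(P) + phi_exp(n, shift_exp(P))
```

Because both `shift_exp` and `delta_exp` are linear in the entries of `P`, the constraints they induce fit directly into a linear program.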


Expressivity. Because ${n+1 \brace k+1} = \frac{1}{k!}\sum_{i=0}^{k}(-1)^{k-i}\binom{k}{i}(i+1)^{n} \in \Theta((k+1)^{n})$, the
offset Stirling numbers of the second kind can form a linear basis for the space of
sums of exponential functions. Each function $\lambda n. b^{n}$ with $b \ge 1$ can be expressed
as a linear combination of the functions $\lambda n. {n+1 \brace k+1}$.
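One concrete way to obtain such a combination uses the standard identity $(x+1)^n = \sum_k x^{\underline{k}} {n+1 \brace k+1}$, where $x^{\underline{k}}$ is the falling factorial; setting $x = b - 1$ gives non-negative integer coefficients for $b^n$. The following sketch checks this for small bases. The function `exp_coefficients` is a hypothetical helper introduced only for this illustration.

```python
from functools import lru_cache
from math import perm

@lru_cache(maxsize=None)
def stirling2(n, k):
    """Stirling number of the second kind via its standard recurrence."""
    if n == k:
        return 1
    if n == 0 or k == 0:
        return 0
    return k * stirling2(n - 1, k) + stirling2(n - 1, k - 1)

def exp_coefficients(b):
    """Coefficients c_k with b**n == sum_k c_k * S(n+1, k+1).

    c_k is the falling factorial (b-1)(b-2)...(b-k), computed with math.perm.
    """
    return [perm(b - 1, k) for k in range(b)]

for b in range(1, 7):
    coeffs = exp_coefficients(b)
    for n in range(10):
        total = sum(c * stirling2(n + 1, k + 1) for k, c in enumerate(coeffs))
        assert total == b ** n
```

For example, `exp_coefficients(3)` yields the coefficients 1, 2, 2, that is, $3^n = 1 + 2{n+1 \brace 2} + 2{n+1 \brace 3}$.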


The function $\lambda n. {n+1 \brace k+1}$ is also non-negative for natural $n$, and non-decreasing
with respect to $n$. These are two natural properties to require of basic potential
functions, since amortized analysis requires non-negative resources, and larger
inputs should not usually become cheaper to process. Further, the properties
are preserved by non-negative linear (i.e. conical) combination, and by $\lhd$ when
defined with a non-negative linear recurrence, so the combinations given by $P$ and
$\lhd P$ always satisfy the two potential function properties.

Ensuring these properties for more general potential functions requires
determining if such a function on a natural domain is always non-negative. This is
non-trivial. In the existing literature on multivariate polynomials, we find this
is undecidable in the worst case [40]. However, restricting to non-negative
linear (that is, conical) combinations of non-negative, non-decreasing functions,
as we have done here, gives simple linear constraints that ensure both desired
properties. For finite bases, this is easily handled via linear programming.

When considering expressivity in this conical combination model of potential
functions, one finds some otherwise-valid potential functions are not
expressible in the conical space given by the offset Stirling number functions.
Nonetheless, Stirling number functions are a maximally expressive basis; it is not possible
to express additional potential functions using a different basis without losing
expressibility elsewhere. Notably, the standard exponential basis is not maximal
in this sense. The formal statement of such maximal expressivity is generalized
in the theorem below. Any finite, sequential subset of the offset Stirling number
functions satisfies the prerequisites of this theorem, as do the binomial coefficient
functions and other well-known functions like the Gaussian polynomials.


Theorem 2. Let $\{f_i\}$ be a finite set of linearly independent functions on the
naturals that are non-negative and non-decreasing. Let $f_i(n)$ be 0 until $n \ge i$,
and let $i < j$ imply that $O(f_i) \subseteq O(f_j)$, with asymptotic equality only when $i = j$.
Let $L$ be the linear span (collection of linear combinations) of $\{f_i\}$, and let $C$ be
its conical span (collection of conical combinations).

There does not exist another linearly independent basis $\{g_i\}$ with linear span
$L$ and conical span $D \supsetneq C$ such that each function in $\{g_i\}$ is non-negative and
non-decreasing. That is, $\{f_i\}$ has a maximally expressive conical span.


Proof. Suppose there is such a basis $\{g_i\}$. We express each basis $\{f_i\}$ and $\{g_i\}$
with linear combinations of the other, and derive a contradiction.

If there is any function in the conical span $D$ of $\{g_i\}$ that is not in $C$, then
this is the case for some basis function $g_k$. Because $g_k \in L$, it can be written as
a linear combination of $\{f_i\}$; let $\sum_i a_i f_i = g_k$. Because $g_k \notin C$, there is at least
one coefficient $a_i < 0$; let it be $a_m$. In case there are multiple candidate elements
$g_k$, pick $g_k$ to be the basis function such that this index $m$ is minimized.

We then see that $g_k(m) = \sum_i a_i f_i(m) = \left(\sum_{i<m} a_i f_i(m)\right) + a_m f_m(m)$
because $f_i(m)$ for $i > m$ is 0. This yields two observations: First, $m < k$, as
otherwise the fastest-growing term of $g_k$ would be negative, but $g_k$ is never negative.
Second, the term $a_m f_m(m)$ is negative, yet $g_k \ge 0$, so it must be that
$\sum_{i<m} a_i f_i(m) > 0$. Thus there exists a coefficient $a_p > 0$ where $p < m$.

Now we look at representing $\{f_i\}$ with $\{g_i\}$. Because the conical span $D$
contains $C$, it can represent each $f_i$ as a conical combination. Notably, a given
$f_i$ cannot be represented only with functions outside of $\Omega(f_i)$, nor with any function
outside of $O(f_i)$, due to growth rates. There is therefore at least one function
in $\{g_i\}$ that is $\Theta(f_i)$, for each $i$. Since the linear span of these corresponding $g_i$
already has the same (finite) dimension as $L$, any additional functions would not
be linearly independent. Due to this, we can say $g_i \in \Theta(f_i)$ uniquely for each $i$.

Take $f_k$ in particular as a conical combination of $\{g_i\}$. We now consider
replacing each element of $\{g_i\}$ in that conical combination with its equivalent
linear combination of elements of $\{f_i\}$. Because of the above correspondence of
growth rates, there must be a positive coefficient for $g_k$. Because $g_k$ has positive
weight $a_p$ on $f_p$ where $p < m < k$, another basis function $g_j$ in the conical
combination must have negative weight on $f_p$ to cancel it out in their linear
combination. However, $g_k$ was picked such that it had the lowest index $m$ with
negative weight across all $\{g_i\}$; it is contradictory for there to be such a $p < m$.


Natural Semantics. The values of ${n+1 \brace k+1}$ count the number of ways to pick $k$
non-empty disjoint subsets of $n$ elements. Many programs with exponential resource
use iterate over collections of subsets, so these numbers naturally arise.

Recall the naive solution to subset sum from the introduction. The algorithm
iterates through all the subsets of numbers in the input list. When considering
Fagin's descriptive complexity result that NP problems are precisely those
expressible in existential second order logic [14], it becomes clear that naive
solutions to any NP-complete problem fit this characterization: naively brute-forcing
through second order terms to find an existential witness is just iterating through
tuples of subsets.
Example. Consider the naive solution to subset sum from the introduction. One
can verify by induction that the number of Boolean and arithmetic operations used on an
input of size $n$ is $3 \times 2^{n} - 2$. We find the same bound here by
preceding each such operation with an explicit tick{1} operation. The AARA
type system then verifies that the type of subsetSum is $L^{3}(\mathbb{Z}) \times \mathbb{Z} \xrightarrow{1/0} \mathsf{bool}$.

Here is the code again, with type annotations on each line tracking the
amount of ${n+1 \brace 2}$ potential on lists, and comments tracking available constant
potential. For clarity, the code is re-written in a let-normal form, and sharing
locations are marked.


    let subsetSum nums:L^3(Z) target =                     (* 1 *)
      match nums:L^3(Z) with
      | [] ->                                              (* 1 *)
        tick 1; target = 0                                 (* 0 *)
      | hd::(tl:L^6(Z)) ->                                 (* 4 *)
        tick 1; let newTarget = target - hd in             (* 3 *)
        (* share tl:L^6(Z) as L^3(Z), L^3(Z) *)
        let withNum = subsetSum tl:L^3(Z) newTarget in     (* 2 *)
        let without = subsetSum tl:L^3(Z) target in        (* 1 *)
        tick 1; withNum || without                         (* 0 *)


The indicated values yield witnesses for the AARA typing rules, so we know
via soundness that the difference between initial and ending potential gives an
upper bound on how many operations were used. That difference is
$1 + 3 \times {n+1 \brace 2} = 3 \times 2^{n} - 2$, where $n$ is the size of nums, exactly the amount used.
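The count of $3 \times 2^{n} - 2$ can also be observed empirically. The sketch below is a plain Python transcription of the naive algorithm, under the assumption that each comparison, subtraction, and disjunction costs exactly one tick. The mutable `counter` argument is a hypothetical stand-in for the `tick` primitive and is not part of the analyzed language.

```python
def subset_sum(nums, target, counter):
    """Naive subset sum; counter[0] accumulates one tick per operation."""
    if not nums:
        counter[0] += 1              # tick for the comparison target = 0
        return target == 0
    hd, tl = nums[0], nums[1:]
    counter[0] += 1                  # tick for the subtraction target - hd
    with_num = subset_sum(tl, target - hd, counter)
    without = subset_sum(tl, target, counter)
    counter[0] += 1                  # tick for the disjunction
    return with_num or without

def ticks_used(n):
    """Total ticks on an input list of length n (the count ignores the values)."""
    counter = [0]
    subset_sum(list(range(1, n + 1)), 0, counter)
    return counter[0]

for n in range(10):
    assert ticks_used(n) == 3 * 2 ** n - 2
```

Both recursive calls are evaluated before the disjunction, mirroring the let-normal form, so the count is independent of the input values.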

Exponential terms with bases higher than 2 can come into play with more
recursive calls, as in the code below enumerating the $3^{n}$ ways to put $n$ labelled
balls into 3 labelled bins.


    let helper xs:L^{2,2}(Z) a b c =                       (* 1 *)
      match xs with
      | [] ->                                              (* 1 *)
        tick 1; [(a,b,c)]                                  (* 0 *)
      | hd::(tl:L^{6,6}(Z)) ->                             (* 3 *)
        (* share tl:L^{6,6}(Z) as L^{2,2}(Z), L^{2,2}(Z), L^{2,2}(Z) *)
        let newA = hd::a in                                (* 3 *)
        let tmp1 = helper tl:L^{2,2}(Z) newA b c in        (* 2 *)
        let newB = hd::b in                                (* 2 *)
        let tmp2 = helper tl:L^{2,2}(Z) a newB c in        (* 1 *)
        let newC = hd::c in                                (* 1 *)
        let tmp3 = helper tl:L^{2,2}(Z) a b newC in        (* 0 *)
        tmp1 @ tmp2 @ tmp3                                 (* 0 *)
    let ballBins3 xs:L^{2,2}(Z) =                          (* 1 *)
      helper xs:L^{2,2}(Z) [] [] []                        (* 0 *)


By paying a unit of resource for each such way using tick, we can use AARA
to bound the count. It assigns a type of
$L^{2,2}(\mathbb{Z}) \xrightarrow{1/0} L^{0,0}(L^{0,0}(\mathbb{Z}) \times L^{0,0}(\mathbb{Z}) \times L^{0,0}(\mathbb{Z}))$
to ballBins3, where the superscripts track ${n+1 \brace 2}$ and ${n+1 \brace 3}$ potential,
respectively. Since $2{n+1 \brace 3} + 2{n+1 \brace 2} + 1 = 3^{n}$, this bound is exact.
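As a cross-check of the count being bounded, the enumeration can be transcribed into ordinary Python and its output size compared with $3^{n}$. This is only an illustrative sketch: the function names mirror the listing, and Python lists stand in for the ML lists.

```python
def helper(xs, a, b, c):
    """Enumerate every way to distribute the elements of xs over bins a, b, c."""
    if not xs:
        return [(a, b, c)]           # one finished assignment: one tick
    hd, tl = xs[0], xs[1:]
    tmp1 = helper(tl, [hd] + a, b, c)
    tmp2 = helper(tl, a, [hd] + b, c)
    tmp3 = helper(tl, a, b, [hd] + c)
    return tmp1 + tmp2 + tmp3

def ball_bins3(xs):
    """All assignments of the labelled balls xs to three labelled bins."""
    return helper(xs, [], [], [])

for n in range(8):
    assert len(ball_bins3(list(range(n)))) == 3 ** n
```

Each base case corresponds to exactly one tick in the annotated listing, so the number of returned triples equals the resource usage that the type bounds.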
5 Mixed Potential


It is possible to combine the existing polynomial potential functions with these
new exponential potential functions to not only conservatively extend both, but
further represent potential functions with their products. This space represents
functions in $\Theta(n^{k}(b+1)^{n})$ for naturals $k, b$, and does so with terms of the form
$\binom{n}{k}{n+1 \brace b+1}$ so that $\phi(n, P) = \sum_{b,k} p_{b,k} \binom{n}{k}{n+1 \brace b+1}$. Note that for $k$ or $b$ equal to
0, the potential functions here reduce to the offset Stirling numbers or binomial
coefficients, respectively.

The methods used to combine these potential functions here can easily be 
generalized to combine any two suitable sets. 


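Simple Shift Operation. It is straightforward to find a linear recurrence for these
products by distributing over their linear recurrences.

$$\begin{aligned}
\binom{n+1}{k}{n+2 \brace b+1} &= \left(\binom{n}{k} + \binom{n}{k-1}\right)\left((b+1){n+1 \brace b+1} + {n+1 \brace b}\right) \\
&= (b+1)\binom{n}{k}{n+1 \brace b+1} + (b+1)\binom{n}{k-1}{n+1 \brace b+1} + \binom{n}{k}{n+1 \brace b} + \binom{n}{k-1}{n+1 \brace b}
\end{aligned}$$

As before, this yields a definition for $\delta$ and $\lhd$ with Equation 1. Letting $P$ now
be indexed by pairs $b, k$: $\lhd p_{b,k} = (b+1)p_{b,k} + (b+1)p_{b,k+1} + p_{b+1,k} + p_{b+1,k+1}$,
and $\delta(P) = p_{0,1} + p_{1,0} + p_{1,1}$. Noting that these definitions are linear again yields
automatability for finite (2-dimensional) prefixes of the basis.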
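The two-dimensional shift can be checked against Equation 1 in the same way as the one-dimensional cases. In the sketch below, annotations are assumed to be dictionaries keyed by index pairs `(b, k)` with the constant entry `(0, 0)` left out; the function names and this representation are illustrative choices, not part of the type system.

```python
from functools import lru_cache
from math import comb

@lru_cache(maxsize=None)
def stirling2(n, k):
    """Stirling number of the second kind via its standard recurrence."""
    if n == k:
        return 1
    if n == 0 or k == 0:
        return 0
    return k * stirling2(n - 1, k) + stirling2(n - 1, k - 1)

def phi_mixed(n, P):
    """Sum of p[b,k] * binomial(n, k) * S(n+1, b+1) over the stored entries."""
    return sum(p * comb(n, k) * stirling2(n + 1, b + 1) for (b, k), p in P.items())

def shift_mixed(P, max_b, max_k):
    """Shifted annotation on the prefix 0..max_b x 0..max_k, excluding (0, 0)."""
    get = lambda b, k: P.get((b, k), 0)
    return {(b, k): (b + 1) * get(b, k) + (b + 1) * get(b, k + 1)
                    + get(b + 1, k) + get(b + 1, k + 1)
            for b in range(max_b + 1) for k in range(max_k + 1) if (b, k) != (0, 0)}

def delta_mixed(P):
    """Constant released per element: p[0,1] + p[1,0] + p[1,1]."""
    return P.get((0, 1), 0) + P.get((1, 0), 0) + P.get((1, 1), 0)

# Example annotation: 2 units of (2^n - 1) and 1 unit of n * (2^n - 1).
P = {(0, 1): 0, (1, 0): 2, (1, 1): 1}
for n in range(10):
    assert phi_mixed(n + 1, P) == delta_mixed(P) + phi_mixed(n, shift_mixed(P, 1, 1))
```

For this example annotation the shifted entries are 1, 6, and 2 with a constant difference of 3.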


Expressivity. The product of non-negative, non-decreasing functions is still
non-negative and non-decreasing, so products of valid potential functions are still
valid. Soundness is preserved by letting $p_0$ be shorthand for the new constant
function coefficient $p_{0,0}$ wherever it is used in Theorem 1. Moreover, maximality
of expressivity is preserved, simply by giving index pairs the ordering relation
$(i_1, i_2) \le (j_1, j_2) \iff i_1 \le j_1 \wedge i_2 \le j_2$ and applying Theorem 2.


Example. Consider bounding the number of Boolean and arithmetic operations
in a variation of subset sum: single-use subset sum. Here the input may contain
duplicate numbers that should be ignored, so as to treat the input as a true set.
This is a trivial change to the mathematical problem, but one that real code
might have to deal with, depending on the implementation of sets.

The code can be changed to handle this by removing all later duplicates of
each number it reaches, so that later recursive calls will never see the number
again. It is easy to create a function remove of type
$\mathbb{Z} \times L^{a+1,b,c}(\mathbb{Z}) \xrightarrow{d/d} L^{a,b,c}(\mathbb{Z})$
to do this for any $a, b, c, d$, where the superscript values represent linear, ${n+1 \brace 2}$,
and $n{n+1 \brace 2}$ potential, respectively.

One can prove by induction that at most $4 \times 2^{n} - n - 3$ Boolean or arithmetic
operations are required. Although this can be bounded with only exponential
functions, the purely exponential potential system cannot reason about the exact
(linear) cost associated with remove, and overestimates the bound to be in $O(3^{n})$.
This mixed system can provide a better (though still loose) bound of
$n2^{n} + 2 \times 2^{n} - n - 1$, giving a type of $L^{0,2,1}(\mathbb{Z}) \times \mathbb{Z} \xrightarrow{1/0} \mathsf{bool}$ to subSum1. After showing
this derivation, we will show how to find the exact bound with AARA.
The following is the single-use subset sum code, with comments on each line
tracking the amount of available resources. For clarity, we indicate
sharing and subtype-weakening locations.


    let subSum1 nums:L^{0,2,1}(Z) target =                          (* 1 *)
      match nums with
      | [] ->                                                       (* 1 *)
        tick 1; target = 0                                          (* 0 *)
      | hd::(tl:L^{1,6,2}(Z)) ->                                    (* 4 *)
        let otherNums:L^{0,6,2}(Z) = remove hd tl:L^{1,6,2}(Z) in   (* 4 *)
        tick 1; let newTarg = target - hd in                        (* 3 *)
        (* weaken otherNums:L^{0,6,2}(Z) to L^{0,4,2}(Z) *)
        (* share otherNums:L^{0,4,2}(Z) as L^{0,2,1}(Z), L^{0,2,1}(Z) *)
        let withNum = subSum1 otherNums:L^{0,2,1}(Z) newTarg in     (* 2 *)
        let without = subSum1 otherNums:L^{0,2,1}(Z) target in      (* 1 *)
        tick 1; withNum || without                                  (* 0 *)


The difference between initial and ending potential gives the upper bound of
$1 + 2{n+1 \brace 2} + n{n+1 \brace 2} = n2^{n} + 2 \times 2^{n} - n - 1$ Boolean or arithmetic operations.

Note that we use the subtype-weakening rule, throwing away 2 units of ${n+1 \brace 2}$
potential. This indicates why the bound is not tight. Next we show how to
improve this bound using potential demotion.
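To see how loose the mixed bound is, the single-use variant can be instrumented in ordinary Python. The sketch below assumes that `remove` costs one tick per element it inspects and that every other Boolean or arithmetic operation costs one tick; `counter` is a hypothetical stand-in for `tick`. On inputs without duplicates, which is the worst case, the count is compared with both the exact value $4 \times 2^{n} - n - 3$ and the looser $n2^{n} + 2 \times 2^{n} - n - 1$.

```python
def remove(x, xs, counter):
    """Drop every occurrence of x from xs, paying one tick per comparison."""
    out = []
    for y in xs:
        counter[0] += 1
        if y != x:
            out.append(y)
    return out

def sub_sum1(nums, target, counter):
    """Single-use subset sum: later duplicates of each number are ignored."""
    if not nums:
        counter[0] += 1                  # comparison target = 0
        return target == 0
    hd, tl = nums[0], nums[1:]
    other = remove(hd, tl, counter)
    counter[0] += 1                      # subtraction target - hd
    with_num = sub_sum1(other, target - hd, counter)
    without = sub_sum1(other, target, counter)
    counter[0] += 1                      # disjunction
    return with_num or without

def ticks_used(nums):
    counter = [0]
    sub_sum1(nums, 0, counter)
    return counter[0]

for n in range(9):
    used = ticks_used(list(range(1, n + 1)))          # distinct inputs
    assert used == 4 * 2 ** n - n - 3                  # exact count
    assert used <= n * 2 ** n + 2 * 2 ** n - n - 1     # looser mixed bound
```

Inputs with duplicates use fewer ticks, because `remove` shrinks the list passed to the recursive calls.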


Demotion. There is one special exception to the non-negativity of potential
annotations that may be added due to the particular nature of the relation between
binomial coefficients and Stirling numbers. It represents the concept of demoting
exponential potential into polynomial potential.

The relevant relation is ${n+1 \brace 2} = 2^{n} - 1 = \sum_{i \ge 1}\binom{n}{i}$. This allows
a unit of ${n+1 \brace 2}$ potential to account for one unit each of all non-constant
binomial coefficient potentials. We can express this with the following additional
subtyping rule. In this rule we interpret the 2-dimensional indexing of the potential
annotation as a matrix, and we let $\vec{p}$ refer to the vector of potential entries
at index coordinates $(0, i)$ for $i \ge 1$.


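$$\frac{P = R + \begin{pmatrix} 0 & \vec{p} \\ r & \vec{0} \end{pmatrix} \qquad Q = R + \begin{pmatrix} 0 & \vec{p} + s\vec{1} \\ r - s & \vec{0} \end{pmatrix}}{L^{P}(A) <: L^{Q}(A)}\ \textsc{Demote}$$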


Theorem 3. The demotion rule is sound. 


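Proof. We need only show that $C <: D$ implies $\Phi(v : D) \le \Phi(v : C)$ for
unchanged values $v$. The rest of soundness then follows as in Theorem 1. To do so,
it is sufficient to show for $l = [a_1, \ldots, a_n]$ we have $\Phi(l : L^{Q}(A)) \le \Phi(l : L^{P}(A))$.
Without loss of generality, we need only consider where $R = 0$. Taking $s \ge 0$, and using that
any prefix of the sum $\sum_{i \ge 1}\binom{n}{i}$ is at most ${n+1 \brace 2}$:

$$\begin{aligned}
\Phi(l : L^{Q}(A)) &= \phi(n, Q) + \sum_{i=1}^{n} \Phi(a_i : A) \\
&= (r - s){n+1 \brace 2} + \sum_{i \ge 1} (p_i + s)\binom{n}{i} + \sum_{i=1}^{n} \Phi(a_i : A) \\
&= r{n+1 \brace 2} - s{n+1 \brace 2} + s\sum_{i \ge 1}\binom{n}{i} + \sum_{i \ge 1} p_i\binom{n}{i} + \sum_{i=1}^{n} \Phi(a_i : A) \\
&\le r{n+1 \brace 2} + \sum_{i \ge 1} p_i\binom{n}{i} + \sum_{i=1}^{n} \Phi(a_i : A) \\
&= \phi(n, P) + \sum_{i=1}^{n} \Phi(a_i : A) = \Phi(l : L^{P}(A))
\end{aligned}$$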


As a corollary, this allows us to loosen the constraint that every annotation $P$
contains only non-negative rationals. In particular, it is no longer required that
$\forall i.\ p_{0,i} \ge 0$. Instead, we require that $\forall i.\ p_{0,i} + p_{1,0} \ge 0$. Each unit of ${n+1 \brace 2}$ potential
may "pay" for one unit of deficit from each polynomial potential function.
Because this is still a linear constraint, type inference remains automatable.
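The effect of demotion on the represented potential can be illustrated numerically. The sketch below assumes an annotation consisting of `r` units of $2^{n} - 1$ potential and a finite list `poly` of binomial coefficients (entry `i-1` on $\binom{n}{i}$), and a non-negative demotion amount `s`. The function names are hypothetical.

```python
from math import comb

def phi_demo(n, r, poly):
    """r units of 2^n - 1 plus poly[i-1] units of binomial(n, i)."""
    return r * (2 ** n - 1) + sum(p * comb(n, i) for i, p in enumerate(poly, start=1))

def demote(r, poly, s):
    """Move s units from the 2^n - 1 coefficient onto every binomial coefficient."""
    return r - s, [p + s for p in poly]

r, poly, s = 4, [0, 1], 3
r2, poly2 = demote(r, poly, s)
for n in range(12):
    # The demoted annotation never represents more potential than the original.
    assert phi_demo(n, r2, poly2) <= phi_demo(n, r, poly)
    if n <= len(poly):
        # With all binomial terms of the list inside the prefix, nothing is lost.
        assert phi_demo(n, r2, poly2) == phi_demo(n, r, poly)
```

For lists longer than the polynomial prefix, the demoted annotation is strictly smaller, which is why the rule is a weakening rather than an equivalence.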

Using Demote, tighter bounds can be obtained. Consider the single-use subset
sum solution from the previous section. Here it is again below, but this time
allowing the linear potential to be paid for by ${n+1 \brace 2}$ potential. AARA can now
provide a type of $L^{-1,4,0}(\mathbb{Z}) \times \mathbb{Z} \xrightarrow{1/0} \mathsf{bool}$ for subSum1, corresponding to the
exact upper bound of $4 \times 2^{n} - n - 3$ operations. This time $n{n+1 \brace 2}$ is elided in
the annotated potentials, as it is not needed.


    let subSum1 nums:L^{-1,4}(Z) target =                         (* 1 *)
      match nums with
      | [] ->                                                     (* 1 *)
        tick 1; target = 0                                        (* 0 *)
      | hd::(tl:L^{-1,8}(Z)) ->                                   (* 4 *)
        let otherNums:L^{-2,8}(Z) = remove hd tl:L^{-1,8}(Z) in   (* 4 *)
        tick 1; let newTarg = target - hd in                      (* 3 *)
        (* share otherNums:L^{-2,8}(Z) as L^{-1,4}(Z), L^{-1,4}(Z) *)
        let withNum = subSum1 otherNums:L^{-1,4}(Z) newTarg in    (* 2 *)
        let without = subSum1 otherNums:L^{-1,4}(Z) target in     (* 1 *)
        tick 1; withNum || without                                (* 0 *)


The difference between initial and ending potential gives the upper bound of
$1 - n + 4{n+1 \brace 2} = 4 \times 2^{n} - n - 3$, as desired.


6 Exponentials, Polynomials, and Logarithms 


The addition of exponential potential also allows for the inference of previously 
nonderivable polynomial-resource types for certain programs. One such way this 
can happen is by compacting the potential of a list into a new list logarithmic 
in size to the first. Performing exponential-cost operations, such as subsetSum, 
on a list of logarithmic size only has linear cost in total. 

In the code below, log takes a list x of length n and returns a list of length
roughly $\log_2(n)$. If x begins with one unit of linear potential, the type system
assigns the output of log one unit of base-2 exponential ($2^n - 1$) potential. We
show this in the code below with types of the form $L^{a,b}$, where a is the linear potential,
and b is the base-2 exponential potential. This lets us find that half can have
type $L^{1,0}(\mathbb{Z}) \xrightarrow{0/0} L^{2,0}(\mathbb{Z})$ and log has type $L^{1,0}(\mathbb{Z}) \xrightarrow{0/0} L^{0,1}(\mathbb{Z})$. The typing of log
shows the conversion from linear to exponential potential.

let half x: L^{1,0}(Z) =                            (* 0 *)
  match x with
  | [] ->                                           (* 0 *)
    []: L^{2,0}(Z)                                  (* 0 *)
  | hd::(tl: L^{1,0}(Z)) ->                         (* 1 *)
    match tl with
    | [] ->                                         (* 1 *)
      []: L^{2,0}(Z)                                (* 1 *)
    | hd2::(tl2: L^{1,0}(Z)) ->                     (* 2 *)
      let halfTail: L^{2,0}(Z) = half tl2 in        (* 2 *)
      (hd::halfTail): L^{2,0}(Z)                    (* 0 *)

let log x: L^{1,0}(Z) =                             (* 0 *)
  match x with
  | [] ->                                           (* 0 *)
    []: L^{0,1}(Z)                                  (* 0 *)
  | hd::(tl: L^{1,0}(Z)) ->                         (* 1 *)
    let halfTail: L^{2,0}(Z) = half tl in           (* 1 *)
    let subSoln: L^{0,2}(Z) = log halfTail in       (* 1 *)
    (hd::subSoln): L^{0,1}(Z)                       (* 0 *)
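
The potential bookkeeping in this listing can be cross-checked with a small executable sketch. The Python functions below are an untyped transcription of half and log (the name log_list is chosen here to avoid clashing with math.log). If m is the length of the output for an input of length n, one unit of exponential potential on the output is worth $2^m - 1$, so the typing is only plausible if $2^m - 1 \leq n$; the usage loop checks this for small n.

```python
def half(xs):
    """Keep every other element, dropping a trailing unpaired one."""
    if len(xs) < 2:
        return []
    return [xs[0]] + half(xs[2:])


def log_list(xs):
    """Return a list whose length is logarithmic in len(xs)."""
    if not xs:
        return []
    return [xs[0]] + log_list(half(xs[1:]))


if __name__ == "__main__":
    for n in range(200):
        m = len(log_list(list(range(n))))
        # one unit of exponential potential on the output (2^m - 1)
        # never exceeds one unit of linear potential on the input (n)
        assert 2 ** m - 1 <= n
```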


Typing log above requires resource-polymorphic recursion. However, this can
be justified by noting that the derivation above can be read as showing that half has type
$L^{a,0}(\mathbb{Z}) \xrightarrow{0/0} L^{2a,0}(\mathbb{Z})$ and log has type $L^{a,0}(\mathbb{Z}) \xrightarrow{0/0} L^{0,a}(\mathbb{Z})$ for any $a \geq 0$.


Coincidentally, log's conversion of linear to exponential potential certifies that
the output list’s size can be bounded by a logarithm of the input’s size. Nonetheless,
logarithmic potential is not directly compatible with the approach this work
takes. Sublinear functions have negative second derivatives, and this yields negative
annotation entries under $\lhd$ applications. This may not be insurmountable,
as the demotion rule showed here, but new ideas are needed overall. Logarithmic
potential has been explored in [32], though the approach there departs from the 
automatable AARA framework of linear constraint solving. 


7 Conclusion and Future Work 


Using Stirling numbers of the second kind allows for the automated inference of 
exponential resource usages via Automatic Amortized Resource Analysis. This 
may be combined with the existing polynomial system, allowing mixtures of 
polynomial and exponential functions to be inferred. Under this system, more 
kinds of programs can now be automatically analyzed, in particular those making 
use of multiple recursive calls, or logarithmically-sized lists. Finally, the frame- 
work put in place to accomplish this separates the concerns of the type system 
and potential functions, paving the way to allow modular addition of different 
potential functions. Future work could extend the work here to cover additional 
language features supported in polynomial AARA literature, like trees [22]. 
1 Introduction 


Kleene algebra with tests (KAT) is a (co)algebraic framework [17,19] that allows 
one to study properties of imperative programs with conditional branching, i.e. 
if-statements and while-loops. KAT is built on Kleene algebra (KA) [6,16], the 
algebra of regular languages. Both KA and KAT enjoy a rich meta-theory, which 
makes them a suitable foundation for reasoning about program verification. 
In particular, it is well-known that the equational theories of KA and KAT 
characterise rational languages [27,21,16] and guarded rational languages [17] 
respectively. Efficient procedures for deciding equivalence have been studied in 
recent years, also in view of recent applications to network verification [3,8,28]. 

Concurrency is a known source of bugs and hence challenges for verifica- 
tion. Hoare, Struth, and collaborators [11], have proposed an extension of KA, 
Concurrent Kleene Algebra (CKA), as an algebraic foundation for concurrent 
programming. CKA enriches the basic language of KA with a parallel composition 
operator - || -. Analogously to KA, CKA also has a semantic characterisation 
for which the equational theory is complete, in terms of rational languages of 
pomsets (words with a partial order on letters) [23,24,15]. 
The development of CKA raises a natural question, namely how tests, which 
were essential in KAT for the study of sequential programs, can be integrated into 
CKA. At first glance, the obvious answer may appear to be to merge KAT with 
CKA, yielding Concurrent Kleene Algebra with Tests (CKAT) — as attempted 
in [12]. However, as it turns out, integrating tests into CKA is quite subtle and 
this naive combination does not adequately capture the behaviour of concurrent 
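programs. In particular, using the CKAT framework of [12] one can prove that
for any test b and CKAT program e:


$$0 \leqq_{\mathrm{CKAT}} b \cdot e \cdot \overline{b} \leqq_{\mathrm{CKAT}} e \parallel (b \cdot \overline{b}) \equiv_{\mathrm{KAT}} e \parallel 0 \equiv_{\mathrm{CKA}} 0$$


thus $b \cdot e \cdot \overline{b} \equiv_{\mathrm{CKAT}} 0$, meaning no program e can change the outcome of any test b.
Or equivalently, and undesirably, that any test is an invariant of any program!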

The core issue is the identification in KAT of sequential composition $\cdot$ and
Boolean conjunction $\wedge$. In the concurrent setting this is not sound, as the values
of variables (and hence tests) can be changed between the two tests.

In order to fix this issue, we have presented Kleene Algebra with Observations 
(KAO) in previous work [13]. Algebraically, KAO differs from KAT in that
conjunction of tests $b \wedge b'$ and their sequential composition $b \cdot b'$ are distinct
operations. In particular, $b \wedge b'$ expresses a single test executed atomically, whereas
$b \cdot b'$ describes two distinct executions, occurring one after the other. As mentioned
above, this distinction is crucial when moving from the sequential setting of KA
to the concurrent setting of CKA, as actions from another thread that happen
to be scheduled after $b$ but before $b'$ may well change the outcome of $b'$.

This newly developed extension of KA enables a novel attempt to enrich CKA 
with the ability to reason about programs that also have the traditional condi- 
tionals: in this paper, we present Concurrent Kleene Algebra with Observations 
(CKAO) and show that it overcomes the problems present in CKAT. 

The traditional plan for developing a variant of (C)KA is to define a separate 
syntax, semantics, and set of axioms, before establishing a formal correspondence 
with the base syntax, semantics and axioms of (C)KA proper, and arguing that 
this correspondence allows one to conclude soundness and completeness of the 
axioms w.r.t. the semantics, as well as decidability of equivalence in the semantics. 
Instead of such a tailor-made proof, however, we take a more general approach 
by first proposing CKA with hypotheses (CKAH) as a formalism for studying 
extensions of CKA, akin to how Kleene algebra with hypotheses [5,18,20,7] can 
be used to extend Kleene algebra. We then apply CKAH to study CKAO, but 
the meta-theory developed can also be applied to extensions other than CKAO. 

Using the CKAH formalism, we instantiate CKAO as CKAH with a particular 
set of hypotheses, and we immediately obtain a syntax and semantics; we can 
then use the meta-theory of CKAH to argue completeness and decidability in a 
modular proof, which composes results about CKA [15] and KAO [13]. 

The technical roadmap of the paper and its contributions are as follows. 


— We introduce Concurrent Kleene Algebra with Hypotheses (CKAH), a for- 
malism for studying extensions of CKA; this is a concurrent extension of 
Kleene Algebra with Hypotheses (Section 4). We show how CKAH is sound 
with respect to rational pomset languages closed under an operation arising 
from the set of hypotheses. We propose techniques to argue completeness 
of the extended set of axioms with respect to the sound model as well as 
decidability of equivalence, capturing methods commonly used in literature 
to argue completeness and decidability for extensions of (concurrent) KA. 
— We prove that CKAO can be presented as an instance of CKAH, for a certain 
set of hypotheses (Section 5). This gives us a sound model of CKAO ‘for free’. 
We then prove that the axioms of CKAO are also complete for this model, 
and that equivalence is decidable, using the techniques developed previously. 


We conclude this introduction by giving an example of how hypotheses can be 
added to CKA to include the meaning of primitive actions. Suppose we were 
designing a DSL for recipes, specifically, the steps necessary, and their order. A 
recipe to prepare cookies might contain the actions mix (mixing the ingredients), 
preheat (pre-heating the oven), chill (chilling the dough) and bake (baking the 
cookies). Using these actions, a recipe like “mix the ingredients until combined; 
chill the dough while pre-heating the oven; bake cookies in the oven” may be 
encoded as $\mathit{mix}^* \cdot (\mathit{chill} \parallel \mathit{preheat}) \cdot \mathit{bake}$. Now, imagine that we have only one oven,
meaning that we cannot bake two batches of cookies concurrently. We might 
encode this restriction on concurrent behaviour by forcing the equation 


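$$(e \cdot \mathit{bake} \cdot f) \parallel (g \cdot \mathit{bake} \cdot h) = (e \cdot \mathit{bake} \parallel g) \cdot (f \parallel \mathit{bake} \cdot h) + (e \parallel g \cdot \mathit{bake}) \cdot (\mathit{bake} \cdot f \parallel h)$$

As a consequence of this hypothesis, one could then derive properties such as

$$\mathit{bake} \parallel (\mathit{bake} \cdot \mathit{mix}) = \mathit{bake} \cdot \mathit{bake} \cdot \mathit{mix} + \mathit{bake} \cdot \mathit{mix} \cdot \mathit{bake}$$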


In a nutshell, this paper provides an algebraic framework — CKAH — together 
with techniques for soundness and completeness results. The framework is flexible 
in that different instantiations of the hypotheses generate very different algebraic 
systems. We provide one instantiation — CKAO — that enables analysis of 
programs with both concurrency primitives and Boolean assertions. This is the 
first sound and complete algebraic theory to reason about such programs. 

For the sake of brevity, some proofs appear in the extended version [14]. 


2 Preliminaries 


We recall basic definitions on pomset languages, used in the semantics of CKA, 
which generalise languages to allow letters in words to be partially ordered. We 
fix a (possibly infinite) alphabet X. When defining sets parametrised by X, say 
S(X), if X is clear from the context we use S to refer to S(X). 


Posets and Pomsets Pomsets [9,10] are labelled posets, up to isomorphism. 


Definition 2.1 (Labelled poset). A labelled poset over X is a tuple $u =
\langle S, \leq, \lambda \rangle$, where S is a finite set (the carrier of u), $\leq$ is a partial order on S
(the order of u), and $\lambda : S \to X$ is a function (the labelling of u).

We will denote labelled posets by bold lower-case letters u, v, etc. We write
$S_u$ for the carrier of u, $\leq_u$ for the order of u, and $\lambda_u$ for the labelling of u. We
assume that any labelled poset has a carrier that is a subset of some countably
infinite set, say $\mathbb{N}$; this allows us to speak about the set of labelled posets over X.
The precise contents of the carrier, however, are not important — what matters 
to us is the labels of the points, and the ordering between them. 


Definition 2.2 (Poset isomorphism, pomset). Let u, v be labelled posets
over X. We say u is isomorphic to v, denoted $u \cong v$, if there exists a bijection
$h : S_u \to S_v$ that preserves labels, and preserves and reflects ordering. More
precisely, we require that $\lambda_v \circ h = \lambda_u$, and $s \leq_u s'$ if and only if $h(s) \leq_v h(s')$.

A pomset over X is an isomorphism class of labelled posets over X, i.e., the
class $[v] = \{u : u \cong v\}$ for some labelled poset v.


We write Pom(X) for the set of pomsets over X, and 1 for the empty pomset. 
As long as we have countably many pomsets in scope, the above allows us 
to assume w.l.o.g. that those pomsets are represented by labelled posets with 
pairwise disjoint carriers; we tacitly make this assumption throughout this paper. 

Pomsets can be concatenated, creating a new pomset that contains all events 
of the operands, with the same label, but which orders all events of the left 
operand before those of the right one. We can also compose pomsets in parallel, 
where events of the operands are juxtaposed without any ordering between them. 


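Definition 2.3 (Pomset composition). Let $U = [u]$ and $V = [v]$ be pomsets
over X. We write $U \parallel V$ for the parallel composition of U and V, which is the
pomset over X represented by the labelled poset $u \parallel v$, where

$$S_{u \parallel v} = S_u \cup S_v \qquad {\leq_{u \parallel v}} = {\leq_u} \cup {\leq_v} \qquad \lambda_{u \parallel v}(x) = \begin{cases} \lambda_u(x) & x \in S_u \\ \lambda_v(x) & x \in S_v \end{cases}$$

Similarly, we write $U \cdot V$ for the sequential composition of U and V, that is,
the pomset represented by the labelled poset $u \cdot v$, where

$$S_{u \cdot v} = S_{u \parallel v} \qquad {\leq_{u \cdot v}} = {\leq_u} \cup {\leq_v} \cup (S_u \times S_v) \qquad \lambda_{u \cdot v} = \lambda_{u \parallel v}$$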
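
The two compositions can be illustrated with a minimal Python sketch. It represents a labelled poset as a triple (carrier, order, labelling); the helper names letter, parallel and sequential are hypothetical. As a simplifying assumption, the order is stored as a set of strict pairs (s, t) with s below t, leaving reflexive pairs implicit; the defining unions are the same either way. Fresh carrier elements are drawn from a global counter so that operands are disjoint, mirroring the tacit disjointness assumption above.

```python
from itertools import count

_fresh = count()  # supplies fresh carrier elements, so operands stay disjoint


def letter(a):
    """Labelled poset with a single point labelled a."""
    s = next(_fresh)
    return ({s}, set(), {s: a})


def parallel(u, v):
    """u || v: union of carriers, orders and labellings."""
    (su, ou, lu), (sv, ov, lv) = u, v
    assert su.isdisjoint(sv), "operands must have disjoint carriers"
    return (su | sv, ou | ov, {**lu, **lv})


def sequential(u, v):
    """u . v: as parallel, plus every point of u below every point of v."""
    (su, ou, lu), (sv, ov, lv) = u, v
    assert su.isdisjoint(sv), "operands must have disjoint carriers"
    cross = {(s, t) for s in su for t in sv}
    return (su | sv, ou | ov | cross, {**lu, **lv})


if __name__ == "__main__":
    a, b, c = letter("a"), letter("b"), letter("c")
    carrier, order, labels = sequential(a, parallel(b, c))  # a . (b || c)
    assert len(carrier) == 3 and len(order) == 2
```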


Just like words are built up from the empty word and letters using concatena- 
tion, we can build a particular set of pomsets using only sequential and parallel 
composition; this will be the primary type of pomset that we will use. 


Definition 2.4 (Series-parallel). The set of series-parallel pomsets (sp-pomsets)
over X, denoted SP(X), is the smallest set s.t. $1 \in \mathsf{SP}(X)$, $a \in \mathsf{SP}(X)$
for every $a \in X$, and it is closed under parallel and sequential composition.


The following characterisation of SP is very useful in proofs. 


Theorem 2.5 (Gischer [9]). Let $U = [u] \in \mathsf{Pom}$. Then $U \in \mathsf{SP}$ if and only if
U is N-free, which is to say that there exist no distinct $s_0, s_1, s_2, s_3 \in S_u$ such
that $s_0 \leq_u s_1$ and $s_2 \leq_u s_3$ and $s_0 \leq_u s_3$, with no other relation between them.
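
N-freeness is directly checkable on a finite poset. The brute-force Python sketch below is one way to do it, under the assumption that the order is given as a transitively closed set of strict pairs (s, t); the function name is_n_free is hypothetical. It looks for four distinct points whose induced order is exactly the N shape from the theorem.

```python
from itertools import permutations


def is_n_free(carrier, order):
    """Return True iff no four distinct points induce exactly the N shape.

    `order` is assumed to be a transitively closed set of strict pairs (s, t),
    meaning s is strictly below t.
    """
    for s0, s1, s2, s3 in permutations(carrier, 4):
        quad = (s0, s1, s2, s3)
        induced = {(s, t) for s in quad for t in quad if (s, t) in order}
        if induced == {(s0, s1), (s2, s3), (s0, s3)}:
            return False
    return True


if __name__ == "__main__":
    # The N-shaped poset itself: 0 < 1, 2 < 3, 0 < 3 and nothing else.
    assert not is_n_free({0, 1, 2, 3}, {(0, 1), (2, 3), (0, 3)})
    # (a || b) . (c || d): both of 0, 1 lie below both of 2, 3.
    assert is_n_free({0, 1, 2, 3}, {(0, 2), (0, 3), (1, 2), (1, 3)})
```

The search is quartic in the size of the carrier, which is fine for illustration but not meant as an efficient recognition procedure.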

One way of comparing pomsets is to see whether they have the same events 
and labels, except that one is “more sequential” in the sense that more events 
are ordered. This is captured by the notion of subsumption [9], defined as follows. 


Definition 2.6 (Subsumption). Let $U = [u]$ and $V = [v]$. We say U is
subsumed by V, written $U \sqsubseteq V$, if there exists a label- and order-preserving
bijection $h : S_v \to S_u$. That is, $\lambda_u \circ h = \lambda_v$ and if $s \leq_v s'$, then $h(s) \leq_u h(s')$.


Subsumption between sp-pomsets can be characterised as follows [9]. 


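Lemma 2.7. Let $\sqsubseteq^{sp}$ be $\sqsubseteq$ restricted to SP. Then $\sqsubseteq^{sp}$ is the smallest precongruence
(preorder monotone w.r.t. the operators) such that for all $U, V, W, X \in \mathsf{SP}$:


$$(U \parallel V) \cdot (W \parallel X) \sqsubseteq^{sp} (U \cdot W) \parallel (V \cdot X)$$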


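CKA: syntax and semantics. CKA terms are generated by the grammar

$$e, f \in T ::= 0 \mid 1 \mid a \in X \mid e + f \mid e \cdot f \mid e \parallel f \mid e^*$$


Semantics of CKA is given in terms of pomset languages, that is subsets of SP,
which we simply denote by $2^{\mathsf{SP}}$. Formally, the function $[\![-]\!] : T \to 2^{\mathsf{SP}}$ assigning
languages to CKA terms is defined as follows:


$$[\![0]\!] = \emptyset \qquad [\![1]\!] = \{1\} \qquad [\![a]\!] = \{a\} \qquad [\![e + f]\!] = [\![e]\!] \cup [\![f]\!]$$
$$[\![e \cdot f]\!] = [\![e]\!] \cdot [\![f]\!] \qquad [\![e^*]\!] = [\![e]\!]^* \qquad [\![e \parallel f]\!] = [\![e]\!] \parallel [\![f]\!]$$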

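Here, we use the pointwise lifting of sequential and parallel composition from
pomsets to pomset languages, i.e., when $\mathcal{U}, \mathcal{V} \subseteq \mathsf{SP}(X)$, we define


$$\mathcal{U} \cdot \mathcal{V} = \{U \cdot V : U \in \mathcal{U}, V \in \mathcal{V}\} \qquad \mathcal{U} \parallel \mathcal{V} = \{U \parallel V : U \in \mathcal{U}, V \in \mathcal{V}\}$$


Furthermore, the Kleene star of a pomset language $\mathcal{U}$ is defined as $\mathcal{U}^* = \bigcup_{n \in \mathbb{N}} \mathcal{U}^n$,
where $\mathcal{U}^0 = \{1\}$ and $\mathcal{U}^{n+1} = \mathcal{U}^n \cdot \mathcal{U}$.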

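Equivalence of CKA terms can be axiomatised in the style of Kleene algebra.
The relation $\equiv$ is the smallest congruence on T (with respect to all operators)
such that for all $e, f, g \in T$:

$$e + 0 \equiv e \qquad e + e \equiv e \qquad e + f \equiv f + e \qquad e + (f + g) \equiv (e + f) + g$$
$$e \cdot (f \cdot g) \equiv (e \cdot f) \cdot g \qquad e \cdot (f + g) \equiv e \cdot f + e \cdot g \qquad (e + f) \cdot g \equiv e \cdot g + f \cdot g$$
$$e \cdot 1 \equiv e \equiv 1 \cdot e \qquad e \cdot 0 \equiv 0 \equiv 0 \cdot e \qquad e \parallel f \equiv f \parallel e \qquad e \parallel 1 \equiv e \qquad e \parallel 0 \equiv 0$$
$$e \parallel (f \parallel g) \equiv (e \parallel f) \parallel g \qquad e \parallel (f + g) \equiv e \parallel f + e \parallel g \qquad 1 + e \cdot e^* \equiv e^* \equiv 1 + e^* \cdot e$$
$$e + f \cdot g \leqq g \implies f^* \cdot e \leqq g \qquad e + f \cdot g \leqq f \implies e \cdot g^* \leqq f$$


in which $e \leqq f$ is the natural order $e + f \equiv f$. The final (conditional) axioms are
referred to as the least fixpoint axioms.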

Laurence and Struth [23] proved this axiomatisation to be sound and complete. 
A decision procedure was proposed in [4]. 

Theorem 2.8 (Soundness, completeness, decidability). Let $e, f \in T$. We
have: $e \equiv f$ if and only if $[\![e]\!] = [\![f]\!]$, and it is decidable whether $[\![e]\!] = [\![f]\!]$.


Readers familiar with CKA will notice that the algebra defined here is not in 
fact CKA as defined in [11]. Indeed the signature axiom of CKA, the exchange law, 
has been omitted. However, as we show in Section 4.2, the standard definition of 
CKA, as well as its completeness proof [15], may be recovered using hypotheses. 


3 Pomset contexts 


The linear one-dimensional structure of words makes it straightforward to define 
occurrences of subwords: if one wants to state that a word w appears in another 
word v, one can simply say that v = xwy for some x and y. Due to the two- 
dimensional nature of pomsets, it is not straightforward to define when a pomset 
occurs inside another pomset, because the pomset could appear below a parallel, 
which is nested in a sequential, which is in a parallel, etc. In what follows we 
define pomset contexts, that will enable us to talk about pomset factorisations in 
a similar fashion as we do for words, and prove some useful properties for these. 


Definition 3.1. Let $*$ be a symbol not occurring in X. A pomset context is a
pomset over $X \cup \{*\}$ with exactly one node labelled by $*$. More precisely, C is a
pomset context if $C = [c]$ with exactly one $s_* \in S_c$ with $\lambda_c(s_*) = *$.


Intuitively, $*$ is a placeholder or gap where another pomset can be inserted.
We write PC(X) for the set of pomset contexts over X, and $\mathsf{PC}^{sp}(X)$ for the
series-parallel pomset contexts over X.

Given a $C \in \mathsf{PC}$ and $U \in \mathsf{Pom}$, we can “plug” U into the gap left in C to
obtain the pomset $C[U] \in \mathsf{Pom}$. More precisely, let $U = [u]$ and $C = [c]$ with
u disjoint from c. We write C[U] for the pomset represented by c[u], where
$S_{c[u]} = S_u \cup S_c - \{s_*\}$ and $\lambda_{c[u]}(s)$ is given by $\lambda_c(s)$ if $s \in S_c - \{s_*\}$, and $\lambda_u(s)$
when $s \in S_u$; lastly, $\leq_{c[u]}$ is the smallest relation on $S_{c[u]}$ satisfying


$$\frac{s \leq_u s'}{s \leq_{c[u]} s'} \qquad \frac{s \leq_c s'}{s \leq_{c[u]} s'} \qquad \frac{s_* \leq_c s' \quad s \in S_u}{s \leq_{c[u]} s'} \qquad \frac{s' \in S_u \quad s \leq_c s_*}{s \leq_{c[u]} s'}$$


It follows easily that $\leq_{c[u]}$ is a partial order. We may also apply contexts to languages:
if $L \subseteq \mathsf{Pom}$ and $C \in \mathsf{PC}$, the language C[L] is defined as $\{C[U] : U \in L\}$.

We now prove some properties of contexts that will be useful later in our 
technical development. First, we note that pomset contexts respect subsumption. 


Lemma 3.2. Let $C, D \in \mathsf{PC}$, $U \in \mathsf{Pom}$. If $C \sqsubseteq D$, then $C[U] \sqsubseteq D[U]$.


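Series-parallel pomset contexts can be given an inductive characterisation.
Lemma 3.3. $\mathsf{PC}^{sp}$ is the smallest pomset language L satisfying


$$\frac{}{* \in L} \qquad \frac{U \in \mathsf{SP} \quad C \in L}{U \cdot C \in L} \qquad \frac{C \in L \quad V \in \mathsf{SP}}{C \cdot V \in L} \qquad \frac{U \in \mathsf{SP} \quad C \in L}{U \parallel C \in L}$$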

We will identify totally ordered pomsets with words, i.e., $X^* \subseteq \mathsf{SP}$. If the
pomset U inserted in a context C is a non-empty word, and the resulting pomset
is a parallel pomset, then we can infer how to factorise C.


Lemma 3.4. Let $C \in \mathsf{PC}^{sp}$ be a pomset context, let $V, W \in \mathsf{Pom}$, and let $U \in X^*$
be non-empty. If $C[U] = V \parallel W$, then there exists a $C' \in \mathsf{PC}^{sp}$ such that either
$C = C' \parallel W$ and $C'[U] = V$, or $C = V \parallel C'$ and $C'[U] = W$.


Application of series-parallel contexts preserves series-parallel pomsets. 
Lemma 3.5. Let $C \in \mathsf{PC}^{sp}$. If $U \in \mathsf{SP}$, then $C[U] \in \mathsf{SP}$ as well.


If we plug the empty pomset into a context, then any subsumed pomset 
can be obtained by plugging the empty pomset into a subsumed context. If the 
subsumed pomset is series-parallel, then so is the subsumed context. 


Lemma 3.6. Let $C \in \mathsf{PC}$ and $V \in \mathsf{Pom}$ with $V \sqsubseteq C[1]$. We can construct
$C' \in \mathsf{PC}$ such that $C' \sqsubseteq C$ and $C'[1] = V$. Moreover, if $V \in \mathsf{SP}$, then $C' \in \mathsf{PC}^{sp}$.


An analogue to the previous lemma can be obtained if instead of the empty 
pomset one inserts a single letter pomset a. 


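Lemma 3.7. Let $C \in \mathsf{PC}$, $V \in \mathsf{Pom}$ and $a \in X$ with $V \sqsubseteq C[a]$. We can construct
$C' \in \mathsf{PC}$ s.t. $C' \sqsubseteq C$ and $C'[a] = V$. Moreover, if $V \in \mathsf{SP}$, then $C' \in \mathsf{PC}^{sp}$.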


4 Concurrent Kleene Algebra with Hypotheses 


Kleene algebra has basic axioms about how program composition operators 
should work in general, and hence does not make any assumptions about how 
these operators work on specific programs. When reasoning about equivalence 
in a programming language, however, it makes sense to embed domain-specific 
truths about the operators into the axioms. For instance, if a programming 
language includes assignments to variables, then subsequent assignments to the 
same variable could be merged into one, giving rise to an equation such as 


$$x \leftarrow m \;\leqq\; x \leftarrow n \cdot x \leftarrow m \qquad (1)$$


which says that the behaviour of first assigning n, then m to x (on the right) 
includes the behaviour of simply assigning m to x directly (on the left). 

Kleene algebra with hypotheses (KAH) [5,18,20,7] enables the addition of 
extra axioms, called hypotheses, to the axioms of KA. The appeal of KAH is that 
it allows a wide range of such hypotheses about programs to be added to the 
equational theory, while retaining the theoretical boilerplate of KA. In particular, 
it turns out that we can derive a sound model for any set of hypotheses, using the 
language model that is sound for KA proper [7]. Moreover, the completeness and 
decidability results that hold for KA can be leveraged to obtain completeness 
and decidability results for some specific types of hypotheses [5,20,7]; in general, 
equivalence under other hypotheses may turn out to be undecidable [18]. 

In this section, we propose a generalisation of so-called Kleene algebra with 
hypotheses to a concurrent setting, showing how one can obtain a sound (pomset 
language) model for any set of hypotheses. We then discuss a number of techniques 
that allow one to prove completeness and decidability of the resulting system for 
a large set of hypotheses, by relying on analogous results about CKA. 


Definition 4.1. A hypothesis is an inequation $e \leqq f$ where $e, f \in T$. When H
is a set of hypotheses, we write $\equiv^{H}$ for the smallest congruence on T generated
by the hypotheses in H as well as the axioms and implications that build $\equiv$. More
concretely, whenever $e \leqq f \in H$, also $e \leqq^{H} f$.


A hypothesis that declares two programs to be equivalent, such as in (1), can
be encoded by including both $e \leqq f$ and $f \leqq e$ in H.


Example 4.2. Suppose the set of primitive actions X includes the increments of 
the form incr x, as well as a statement print, which writes the complete state 
of the machine (including variables) on the standard output. Since we would like 
to depict the state consistently, the state should not change while the output is 
rendered; hence, print cannot be executed concurrently with any other action. 
Instead, when a program containing print is scheduled to run in parallel with an 
assignment, it must be interleaved such that the assignment runs either entirely 
before or after print. To encode this, we can include in H the hypotheses 


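$$\mathtt{incr}\ x \parallel \mathtt{print} \equiv \mathtt{incr}\ x \cdot \mathtt{print} + \mathtt{print} \cdot \mathtt{incr}\ x$$
for all variables x. This allows us to prove, for instance, that
$$\mathtt{print} \cdot \mathtt{incr}\ x \cdot \mathtt{incr}\ x \cdot \mathtt{print} \leqq^{H} (\mathtt{incr}\ x \parallel \mathtt{print})^*$$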
That is, if we run some number of increments and print statements in parallel, 
it is possible that x is incremented twice between print statements. 


To obtain a model of CKAH, it is not enough to use $[\![-]\!]$, as some programs
equated by the hypotheses might have different semantics. To get around this, we
adapt the method from [7]: take $[\![-]\!]$ as a base semantics, and adapt the resulting
language using hypotheses, such that the pomsets that could be obtained by
rearranging the term using the hypotheses are also present in the language:


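Definition 4.3. Let $L \subseteq \mathsf{Pom}$. We define the H-closure of L, written $L{\downarrow}^{H}$, as
the smallest language containing L such that for all $e \leqq f \in H$ and $C \in \mathsf{PC}^{sp}$,
if $C[[\![f]\!]] \subseteq L{\downarrow}^{H}$, then $C[[\![e]\!]] \subseteq L{\downarrow}^{H}$. Formally, $L{\downarrow}^{H}$ may be described as the
smallest language satisfying the following inference rules:

$$\frac{}{L \subseteq L{\downarrow}^{H}} \qquad \frac{e \leqq f \in H \quad C \in \mathsf{PC}^{sp} \quad C[[\![f]\!]] \subseteq L{\downarrow}^{H}}{C[[\![e]\!]] \subseteq L{\downarrow}^{H}}$$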


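Example 4.4. Continuing with H and X as in the previous examples, note that
if $L = [\![\mathtt{incr}\ x \parallel \mathtt{print}]\!]$, then $\mathtt{incr}\ x \parallel \mathtt{print} \in L{\downarrow}^{H}$. Choose $C = *$; we have
$C[\mathtt{incr}\ x \cdot \mathtt{print}] = \mathtt{incr}\ x \cdot \mathtt{print}$. Because $\mathtt{incr}\ x \cdot \mathtt{print} + \mathtt{print} \cdot \mathtt{incr}\ x \leqq
\mathtt{incr}\ x \parallel \mathtt{print} \in H$ and for all $U \in [\![\mathtt{incr}\ x \parallel \mathtt{print}]\!]$ we have $C[U] \in L \subseteq L{\downarrow}^{H}$,
we get $C[\mathtt{incr}\ x \cdot \mathtt{print}] \in L{\downarrow}^{H}$ and therefore $\mathtt{incr}\ x \cdot \mathtt{print} \in L{\downarrow}^{H}$.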

We observe the following useful properties about the interaction between 
closure and other operators on pomset languages. 


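Lemma 4.5. Let $L, K \subseteq \mathsf{Pom}$ and $C \in \mathsf{PC}$. The following hold.

1. $L \subseteq K{\downarrow}^{H}$ iff $L{\downarrow}^{H} \subseteq K{\downarrow}^{H}$.
2. If $L \subseteq K$, then $L{\downarrow}^{H} \subseteq K{\downarrow}^{H}$.
3. $(L \cup K){\downarrow}^{H} = (L{\downarrow}^{H} \cup K{\downarrow}^{H}){\downarrow}^{H}$
4. $(L \cdot K){\downarrow}^{H} = (L{\downarrow}^{H} \cdot K{\downarrow}^{H}){\downarrow}^{H}$
5. $(L \parallel K){\downarrow}^{H} = (L{\downarrow}^{H} \parallel K{\downarrow}^{H}){\downarrow}^{H}$
6. $(L^*){\downarrow}^{H} = ((L{\downarrow}^{H})^*){\downarrow}^{H}$
7. If $L{\downarrow}^{H} \subseteq K{\downarrow}^{H}$, then $C[L]{\downarrow}^{H} \subseteq C[K]{\downarrow}^{H}$.
8. If $L \subseteq \mathsf{SP}$, then $L{\downarrow}^{H} \subseteq \mathsf{SP}$.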


Remark 4.6. Property (1) states that $-{\downarrow}^{H}$ is a closure operator. However, it is not
in general a Kuratowski closure operator [22], since it fails to commute with union.
For instance, let $a, b, c \in X$ and $H = \{a \leqq b + c\}$; then $\{b\}{\downarrow}^{H} \cup \{c\}{\downarrow}^{H} = \{b, c\}$,
while $a \in (\{b\} \cup \{c\}){\downarrow}^{H}$.
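
The counterexample in this remark can be replayed in a small Python sketch. It is deliberately restricted to a special case: languages consisting of single-letter pomsets, and hypotheses whose left-hand side is a letter and whose right-hand side is a sum of letters. In that case the only context that can apply is the trivial one, so the closure becomes a simple fixed-point iteration. The function name closure and the encoding of hypotheses as pairs are assumptions made for this illustration.

```python
def closure(lang, hypotheses):
    """H-closure for languages of single-letter pomsets.

    `hypotheses` is a list of pairs (e, fs), read as "e <= sum of the letters
    in fs". If all of fs is already in the closure, e is added too.
    """
    result = set(lang)
    changed = True
    while changed:
        changed = False
        for e, fs in hypotheses:
            if fs <= result and e not in result:
                result.add(e)
                changed = True
    return result


if __name__ == "__main__":
    H = [("a", {"b", "c"})]                  # the hypothesis a <= b + c
    separate = closure({"b"}, H) | closure({"c"}, H)
    joint = closure({"b", "c"}, H)
    assert separate == {"b", "c"}            # closing first, then taking the union
    assert joint == {"a", "b", "c"}          # closing the union adds a
```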


Using Lemma 4.5, we can show that, if we combine the semantics from $[\![-]\!]$
with H-closure, we obtain a sound semantics for CKA with hypotheses H.


Lemma 4.7 (Soundness). If $e \equiv^{H} f$, then $[\![e]\!]{\downarrow}^{H} = [\![f]\!]{\downarrow}^{H}$.


The converse of the above, where semantic equivalence is sufficient to establish
axiomatic equivalence, is called completeness. Similarly, we may also be interested
in deciding whether $[\![e]\!]{\downarrow}^{H}$ and $[\![f]\!]{\downarrow}^{H}$ coincide.


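Definition 4.8. Let $e, f \in T$:


(i) If $[\![e]\!]{\downarrow}^{H} = [\![f]\!]{\downarrow}^{H}$ implies $e \equiv^{H} f$, then H is called complete.
(ii) If $[\![e]\!]{\downarrow}^{H} = [\![f]\!]{\downarrow}^{H}$ is decidable, then H is said to be decidable.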


Note that, in the special case where $H = \emptyset$, we know that H is complete and
decidable by Theorem 2.8. One method to find out whether H is complete or
decidable is to reduce the problem to this special case. More concretely, suppose
we know $[\![e]\!]{\downarrow}^{H} = [\![f]\!]{\downarrow}^{H}$, and want to establish that $e \equiv^{H} f$. If we could find a
set of hypotheses $H'$ that is complete, and we could map e and f to terms r(e)
and r(f) such that $[\![r(e)]\!]{\downarrow}^{H'} = [\![r(f)]\!]{\downarrow}^{H'}$, then we would have $r(e) \equiv^{H'} r(f)$. If
we could then “lift” that equivalence to prove $e \equiv^{H} f$, we are done. Similarly, if
we would know that $[\![r(e)]\!]{\downarrow}^{H'} = [\![r(f)]\!]{\downarrow}^{H'}$ is equivalent to $[\![e]\!]{\downarrow}^{H} = [\![f]\!]{\downarrow}^{H}$, we
could decide the latter. To formalise this intuition, we first need the following.


Definition 4.9. We say that H implies $H'$ if we can use the hypotheses in H to
prove those of $H'$, i.e., if for every hypothesis $e \leqq f \in H'$ it holds that $e \leqq^{H} f$.


Implication relates to equivalence and closure as follows. 
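Lemma 4.10. Let H and $H'$ be sets of hypotheses such that H implies $H'$.

(i) If $e, f \in T$ with $e \equiv^{H'} f$, then $e \equiv^{H} f$.
(ii) If $L \subseteq \mathsf{Pom}$, then $L{\downarrow}^{H'} \subseteq L{\downarrow}^{H}$.
(iii) If $L \subseteq \mathsf{Pom}$, then $(L{\downarrow}^{H'}){\downarrow}^{H} = L{\downarrow}^{H}$.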

If H implies H’ and vice versa, then H is complete (resp. decidable) precisely 
when H’ is. In general, however, this is not very helpful; we need something more 
asymmetrical, in order to get from a complicated set of hypotheses H to a simpler 
set of hypotheses H’, where completeness or decidability might be easier to prove. 
Ideally, we would like to reduce to H’ = Ø, which is complete and decidable. 

One way to formalise this idea of a reduction is as follows. 


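Definition 4.11. Let H and $H'$ be sets of hypotheses such that H implies $H'$. A
map $r : T \to T$ is a reduction from H to $H'$ when both of the following are true:


(i) for $e \in T$, it holds that $e \equiv^{H} r(e)$, and
(ii) for $e, f \in T$, if $[\![e]\!]{\downarrow}^{H} = [\![f]\!]{\downarrow}^{H}$, then $[\![r(e)]\!]{\downarrow}^{H'} = [\![r(f)]\!]{\downarrow}^{H'}$.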


We call H reducible to H’ if there exists a reduction from H to H’. 


It is straightforward to show that reductions do indeed carry over completeness 
and decidability results, in the following sense. 


Lemma 4.12. Suppose H is reducible to H'. If H' is complete (respectively 
decidable), then so is H. 


Example 4.13. Let $X = \{a, b\}$. Let $H = \{a \leqq b\}$. We can define for $e \in T$ the
term $r(e) \in T$, which is e but with every occurrence of b replaced by $a + b$. For
instance, $r(a \cdot b^* \parallel c) = a \cdot (a + b)^* \parallel c$. An inductive argument on the structure
of e shows that r reduces H to $\emptyset$, and hence H is complete and decidable.
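
The map r of this example is a plain structural recursion, which the following Python sketch makes explicit. The encoding of terms as nested tuples (with tags "sym", "zero", "one", "plus", "seq", "par" and "star") is an assumption made for this illustration, not notation from the paper.

```python
A = ("sym", "a")
B = ("sym", "b")


def r(term):
    """Replace every occurrence of the letter b by a + b, leaving the rest intact."""
    kind = term[0]
    if kind == "sym":
        return ("plus", A, B) if term[1] == "b" else term
    if kind in ("zero", "one"):
        return term
    if kind == "star":
        return ("star", r(term[1]))
    # binary operators: "plus", "seq", "par"
    return (kind, r(term[1]), r(term[2]))


if __name__ == "__main__":
    c = ("sym", "c")
    # a . b* || c  is mapped to  a . (a + b)* || c
    e = ("par", ("seq", A, ("star", B)), c)
    assert r(e) == ("par", ("seq", A, ("star", ("plus", A, B))), c)
```

The sketch only implements the syntactic map; that it is a reduction in the sense of Definition 4.11 is what the inductive argument mentioned above establishes.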


It is not very hard to show that reductions can be chained, as follows. 
Lemma 4.14. If H reduces to H', which reduces to H”, then H reduces to H”. 


Another way of reducing H is to find two sets of hypotheses $H_0$ and $H_1$, and
reduce each of those to another set of hypotheses $H'$ [7]. The idea is that a proof
of $e \equiv^{H} f$ can be split up in a phase where we find $e', f' \in T$ such that $e \equiv^{H_0} e'$
and $f \equiv^{H_0} f'$, after which we find $e'', f'' \in T$ with $e' \equiv^{H_1} e''$ and $f' \equiv^{H_1} f''$.
Finally, we establish that $e'' \equiv^{H'} f''$, before lifting those equivalences to H,
concluding

$$e \equiv^{H} e' \equiv^{H} e'' \equiv^{H} f'' \equiv^{H} f' \equiv^{H} f$$


One way of achieving this is as follows. 


Definition 4.15. We say that H factorises into $H_0$ and $H_1$ if H implies both
$H_0$ and $H_1$, and for all $L \subseteq \mathsf{SP}$ we have that $L{\downarrow}^{H} = (L{\downarrow}^{H_0}){\downarrow}^{H_1}$.


In order to use factorisation to compose simpler reductions into more compli- 
cated ones, we need a slightly stronger notion of reduction, as follows. 


Definition 4.16. We say that r is a strong reduction from H to $H'$ if it is a
reduction such that for $e \in T$, it holds that $[\![e]\!]{\downarrow}^{H} = [\![r(e)]\!]{\downarrow}^{H'}$.


Note that this additional condition essentially strengthens the second condition 
in Definition 4.11. Factorisation then lets us compose strong reductions. 

Lemma 4.17. Suppose H factorises into $H_0$ and $H_1$, and both $H_0$ and $H_1$
strongly reduce to $H'$. Then H strongly reduces to $H'$.


The remainder of this section is devoted to developing techniques that can be 
used to design reductions, based on the properties of the sets of hypotheses under 
consideration. Using the lemmas we have established so far, these techniques may 
then be leveraged to obtain completeness and decidability results. 


4.1 Reification 


It can happen that the hypotheses in H impose an algebraic structure on the 
letters in X; for instance, as we will see later on, the letters in H could be 
propositional terms, whose equivalence is mediated by the axioms of Boolean 
algebra. In order to peel away this layer of axioms and reduce to a smaller H’, 
we can try to reduce to terms over a smaller alphabet, making the algebraic 
structure on the letters irrelevant to equivalence. In a sense, performing this 
kind of reduction is like showing that the equivalences between letters from the 
hypotheses can already be guaranteed by replacing them with the right terms. 


Example 4.18. Let X be the set of group terms over a (finite) alphabet A, that is,
X consists of the terms generated by the grammar $g, h ::= u \mid a \in A \mid g \circ h \mid \overline{g}$.
Furthermore, let $\equiv_G$ be the smallest congruence generated by the group axioms,
i.e., for all $g, h, i \in A$ it holds that


$$g \circ (h \circ i) \equiv_G (g \circ h) \circ i \qquad g \circ u \equiv_G g \equiv_G u \circ g \qquad g \circ \overline{g} \equiv_G u \equiv_G \overline{g} \circ g$$


Lastly, let $\mathsf{group} = \{g \leqq h : g \equiv_G h\}$. We can then define a reduction from group
to $\emptyset$ by replacing every letter (group term) in a term e with its reduced form,
that is, with the (unique) equivalent group term of minimum size. For instance,
if $A = \{a, b, c\}$, then we send the term $a \circ \overline{a} \parallel b \circ c \circ \overline{c}$ to the term $u \parallel b$.
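
One way to compute such a reduced form is the usual free-group word reduction, sketched below in Python. The tuple encoding of group terms ("unit", "gen", "comp", "inv") is an assumption of this illustration, and so is the choice to represent the reduced form as a word of signed generators (the empty word standing for u) rather than as a term. A group term is first flattened to a word, pushing inverses inwards, and then adjacent inverse pairs are cancelled with a stack.

```python
def to_word(term):
    """Flatten a group term into a list of (generator, sign) pairs."""
    kind = term[0]
    if kind == "unit":
        return []
    if kind == "gen":
        return [(term[1], 1)]
    if kind == "comp":
        return to_word(term[1]) + to_word(term[2])
    if kind == "inv":
        # the inverse of g o h is (inverse of h) o (inverse of g)
        return [(x, -s) for (x, s) in reversed(to_word(term[1]))]
    raise ValueError(f"unknown group term: {term!r}")


def reduce_word(word):
    """Cancel adjacent pairs x, x-inverse until none remain."""
    stack = []
    for x, s in word:
        if stack and stack[-1] == (x, -s):
            stack.pop()
        else:
            stack.append((x, s))
    return stack


def reduced_form(term):
    return reduce_word(to_word(term))


if __name__ == "__main__":
    a, b, c = ("gen", "a"), ("gen", "b"), ("gen", "c")
    assert reduced_form(("comp", a, ("inv", a))) == []                      # reduces to u
    assert reduced_form(("comp", b, ("comp", c, ("inv", c)))) == [("b", 1)]  # reduces to b
```

Applying reduced_form to each letter of a term, and leaving the CKA operators untouched, gives the letter-wise replacement described in the example.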


For the remainder of this section, we fix a subalphabet $\Gamma \subseteq X$. When
$r : X \to T(\Gamma)$, we extend r to a map from $T(X)$ to $T(\Gamma)$, by inductively applying
r to terms. We can also apply r to a series-parallel pomset, obtaining a pomset
language. More precisely, when U is a pomset, we define r(U) as follows:


$$r(1) = \{1\} \qquad r(a) = [\![r(a)]\!] \qquad r(U \cdot V) = r(U) \cdot r(V) \qquad r(U \parallel V) = r(U) \parallel r(V)$$


Lastly, when $L \subseteq \mathsf{SP}$, we write r(L) for the set $\bigcup \{r(U) : U \in L\}$.
The following then formalises the idea of reducing by replacing letters. 


Definition 4.19. A map $r : X \to T(\Gamma)$ is a reification from H to $H'$ if


(i) For all $a \in X$, it holds that $r(a) \equiv^{H} a$.
(ii) r is expansive on $\Gamma$, i.e., for all $a \in \Gamma$, $a \leqq r(a)$.
(iii) $H'$-closure preserves $\Gamma$, i.e., for all $L \subseteq \mathsf{SP}(\Gamma)$, also $L{\downarrow}^{H'} \subseteq \mathsf{SP}(\Gamma)$.
(iv) For all $e \leqq f \in H$, it holds that $r(e) \leqq^{H'} r(f)$.

Example 4.20. Continuing with the previous example, let r be the map that
sends a group term to its reduced form; we claim that r is a reification from
group to $\emptyset$. By definition, we then know that for a group term $g \in X$, we have
$r(g) \equiv_G g$, and hence $r(g) \equiv^{\mathsf{group}} g$. Furthermore, the reduction of a reduced term
is that term itself; hence, the second condition is satisfied. The third condition
holds trivially. Lastly, if $e \leqq f \in \mathsf{group}$, then $e, f \in X$ such that $e \equiv_G f$. Since
reductions are unique, we then know that $r(e) = r(f)$, and hence $r(e) \leqq^{\emptyset} r(f)$.


We have the following general properties of a map r, which we will use in 
demonstrating how to obtain a reduction from a reification. 


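Lemma 4.21. Let $r : \Sigma \to T$ be some map.


(i) For all $C \in \mathsf{PC}^{\mathsf{sp}}$, we have $r(C) \subseteq \mathsf{PC}^{\mathsf{sp}}$.
(ii) For all $L \subseteq \mathsf{SP}$ and $C \in \mathsf{PC}^{\mathsf{sp}}$, we have $r(C[L]) = \bigcup_{D \in r(C)} D[r(L)]$.
(iii) For all $e \in T$, it holds that $r(\llbracket e \rrbracket) = \llbracket r(e) \rrbracket$.


The following technical lemma is a consequence of property (iv).

Lemma 4.22. If $r$ is a reification and $L \subseteq \mathsf{SP}(\Sigma)$, then $r(L{\downarrow}^{H}) \subseteq r(L){\downarrow}^{H'}$.

Using this, we can then show how to obtain a reduction from a reification.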


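Lemma 4.23. If $H$ implies $H'$ and $r$ is a reification from $H$ to $H'$, then $r$ is a
reduction from $H$ to $H'$.


Proof. The first condition, i.e., that for $e \in T$ we have $e \equiv^{H} r(e)$, can be checked
using the first property of reification by induction on the structure of $e$. It thus
remains to check the second condition; we do this by proving that for all $e \in T(\Sigma)$
we have $r(\llbracket e \rrbracket{\downarrow}^{H}) = \llbracket r(e) \rrbracket{\downarrow}^{H'}$. To this end, we derive as follows:


$$\begin{aligned}
r(\llbracket e \rrbracket{\downarrow}^{H}) &\subseteq r(\llbracket e \rrbracket){\downarrow}^{H'} && \text{(Lemma 4.22)} \\
&= \llbracket r(e) \rrbracket{\downarrow}^{H'} && \text{(Lemma 4.21(iii))} \\
&\subseteq r(\llbracket r(e) \rrbracket{\downarrow}^{H'}) && \text{(property (ii))} \\
&\subseteq r(\llbracket r(e) \rrbracket{\downarrow}^{H}) && \text{(Lemma 4.10(ii))} \\
&= r(\llbracket e \rrbracket{\downarrow}^{H}) && \text{(property (i), soundness)}
\end{aligned}$$


Specifically, in the third step, property (ii) ensures that for $L \subseteq \mathsf{SP}(\Gamma)$ we have
$L \subseteq r(L)$. We can use this property because $H'$-closure preserves the $\Gamma$-language
by property (iii). This completes the proof.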


4.2 Factoring the exchange law 


In the basic axioms that generate =, there is no interaction between sequential 
and parallel composition. One sensible way of adding that kind of interaction is, 
as suggested by Hoare, Struth and collaborators [11], by adding an axiom of the 
form $(e \parallel f) \cdot (g \parallel h) \le (e \cdot g) \parallel (f \cdot h)$, known as the exchange law. Essentially,
this axiom encodes the possibility of (partial) interleaving: when $e \cdot g$ runs in
parallel with $f \cdot h$, one possible behaviour is that, first $e$ runs in parallel with $f$,
and then $g$ runs in parallel with $h$. The core observation of this section is that
the exchange law can be treated as another set of hypotheses, as we show below,
and this can then be used to recover the completeness result of CKA [15].


Definition 4.24. We write exch for the set
$$\{(e \parallel f) \cdot (g \parallel h) \le (e \cdot g) \parallel (f \cdot h) : e, f, g, h \in T\}.$$


The semantic effect of adding exch to our hypotheses is that, if U is a pomset 
in a series-parallel language L, and V is a series-parallel pomset subsumed by 
U, then V is in the exch-closure of L. Intuitively, the exch-closure adds pomsets 
that are more sequential, i.e., have more ordering, than the ones already in L. 
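Indeed, exch-closure coincides with the downward closure w.r.t. $\sqsubseteq^{\mathsf{sp}}$.


Lemma 4.25. Let $L \subseteq \mathsf{SP}$ and $U \in \mathsf{SP}$. Now $U \in L{\downarrow}^{\mathsf{exch}}$ if and only if there
exists a $V \in L$ such that $U \sqsubseteq V$.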
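To make the subsumption order concrete, the following Python sketch checks it by brute force on small pomsets. It is an illustration under assumptions: a pomset is encoded as a list of node labels plus a set of strict order pairs, and subsumption is taken in the standard sense that the left-hand pomset has the same labelled nodes and at least as much ordering. The function names are hypothetical.

```python
from itertools import permutations


def transitive_closure(pairs):
    """Smallest transitive relation containing the given set of pairs."""
    closure = set(pairs)
    changed = True
    while changed:
        changed = False
        for (a, b) in list(closure):
            for (c, d) in list(closure):
                if b == c and (a, d) not in closure:
                    closure.add((a, d))
                    changed = True
    return closure


def subsumed(u, v):
    """Check whether u is subsumed by v, i.e. whether some label-preserving
    bijection from the nodes of v to the nodes of u preserves the order of v."""
    u_labels, u_order = u
    v_labels, v_order = v
    if sorted(u_labels) != sorted(v_labels):
        return False
    u_lt = transitive_closure(u_order)
    n = len(v_labels)
    for h in permutations(range(n)):
        labels_ok = all(v_labels[i] == u_labels[h[i]] for i in range(n))
        order_ok = all((h[i], h[j]) in u_lt for (i, j) in v_order)
        if labels_ok and order_ok:
            return True
    return False


# Nodes 0..3 carry the labels a, b, c, d.
# (a ∥ b) · (c ∥ d): both a and b precede both c and d.
lhs = (["a", "b", "c", "d"], {(0, 2), (0, 3), (1, 2), (1, 3)})
# (a · c) ∥ (b · d): a precedes c, and b precedes d.
rhs = (["a", "b", "c", "d"], {(0, 2), (1, 3)})
```

Here `lhs` is the pomset of the left-hand side of an instance of the exchange law and `rhs` that of the right-hand side; `subsumed(lhs, rhs)` holds while `subsumed(rhs, lhs)` does not, matching the intuition that the exch-closure only adds more sequential pomsets.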


We have previously shown that exch is complete [15]; as a matter of fact, the 
pivotal result from op. cit. can be presented as follows. 


Theorem 4.26. The set of hypotheses exch is strongly reducible to $\emptyset$.


When exch is contained in our hypotheses, it is not immediately clear whether 
those hypotheses can be reduced. What we can do is try to factorise our hypotheses 
into exch and some residual set of hypotheses, and prove strong reducibility for 
that residual set. To this end, we first note that, in some circumstances, the 
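$H$-closure of the exch-closure remains downward-closed w.r.t. $\sqsubseteq^{\mathsf{sp}}$.


Lemma 4.27. Suppose that for each $e \le f \in H$ we have that $e = 1$ or $e = a$ for
some $a \in \Sigma$, and let $L \subseteq \mathsf{SP}$. If $U, V \in \mathsf{SP}$ such that $U \sqsubseteq V$ and $V \in (L{\downarrow}^{\mathsf{exch}}){\downarrow}^{H}$,
then $U \in (L{\downarrow}^{\mathsf{exch}}){\downarrow}^{H}$.


Using this fact, we can now show that, under the same precondition, $\mathsf{exch} \cup H$
factors into exch and $H$. This factorisation is what we were looking for: it tells
us that whenever $H$ strongly reduces to $\emptyset$, so does $H \cup \mathsf{exch}$.


Lemma 4.28. Suppose that for each $e \le f \in H$ we have that $e = 1$, or $e = a$
for some $a \in \Sigma$. Then $H \cup \mathsf{exch}$ factorises into exch and $H$.


Proof. Since $H, \mathsf{exch} \subseteq H \cup \mathsf{exch}$, it should be obvious that $H \cup \mathsf{exch}$ implies both
$H$ and exch. It remains to show that, if $L \subseteq \mathsf{SP}$, then $(L{\downarrow}^{\mathsf{exch}}){\downarrow}^{H} = L{\downarrow}^{H \cup \mathsf{exch}}$.
The inclusion from left to right is a consequence of Lemma 4.10(ii)–(iii).

For the other inclusion, we show that if $A \subseteq L{\downarrow}^{H \cup \mathsf{exch}}$, then $A \subseteq (L{\downarrow}^{\mathsf{exch}}){\downarrow}^{H}$.
The proof proceeds by induction on the construction of $A \subseteq L{\downarrow}^{H \cup \mathsf{exch}}$. In the base,
we have that $A \subseteq L{\downarrow}^{H \cup \mathsf{exch}}$ because $A = L$; in that case, $A \subseteq L{\downarrow}^{\mathsf{exch}} \subseteq (L{\downarrow}^{\mathsf{exch}}){\downarrow}^{H}$.

For the inductive step, $A \subseteq L{\downarrow}^{H \cup \mathsf{exch}}$ because there exist $e \le f \in H \cup \mathsf{exch}$
and $C \in \mathsf{PC}^{\mathsf{sp}}$ such that $A = C[\llbracket e \rrbracket]$, and $C[\llbracket f \rrbracket] \subseteq L{\downarrow}^{H \cup \mathsf{exch}}$. By induction, we
then know that $C[\llbracket f \rrbracket] \subseteq (L{\downarrow}^{\mathsf{exch}}){\downarrow}^{H}$. On the one hand, if $e \le f \in H$, then
$A = C[\llbracket e \rrbracket] \subseteq (L{\downarrow}^{\mathsf{exch}}){\downarrow}^{H}$ immediately. On the other hand, if $e \le f \in \mathsf{exch}$, then
$\llbracket e \rrbracket \sqsubseteq^{\mathsf{sp}} \llbracket f \rrbracket$, and hence $C[\llbracket e \rrbracket] \sqsubseteq^{\mathsf{sp}} C[\llbracket f \rrbracket]$ by Lemma 3.2. By Lemma 3.5 and
Lemma 4.27, it then follows that $A = C[\llbracket e \rrbracket] \subseteq (L{\downarrow}^{\mathsf{exch}}){\downarrow}^{H}$.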
4.3 Lifting


A number of reduction procedures already exist at the level of Kleene algebra [20,7];
ideally, one would like to lift those procedures to CKA.


Example 4.29. The reductions in Example 4.13 and Example 4.18 worked out 
for terms without ||, and then extended inductively, by defining the reduction of 
e || f to be the parallel composition of the reductions of e and f respectively. 
As a non-example, consider $H = \{a \le 1\}$. Even though this hypothesis can
be reduced to $\emptyset$ within Kleene algebra [5], it is not obvious how this would work
for pomset languages. In particular, if $1 \in L$, then $1 \parallel \cdots \parallel 1 \in L$ for any number
of 1's, and hence $a \parallel \cdots \parallel a \in L{\downarrow}^{H}$ for any number of $a$'s. This precludes the
possibility of a strong reduction to $\emptyset$, because $\llbracket 1 \rrbracket{\downarrow}^{H}$ is a pomset language of
unbounded (parallel) width, which cannot be expressed by any $e \in T$ [25].


We now establish a set of sufficient conditions for such a lifting to work. To 
this end, we first formally define Kleene algebra syntax, axioms and semantics. 


Definition 4.30. Write $T_{KA}$ for the set of Kleene algebra terms, i.e., the terms
in $T$ that do not contain $\parallel$. Furthermore, we write $\equiv_{KA}$ for the smallest congruence
on $T_{KA}$ that is generated by the axioms of $\equiv$ that do not involve $\parallel$.


When $e \in T_{KA}$, it is not hard to see that $\llbracket e \rrbracket$ contains totally ordered pomsets,
i.e., words, exclusively. Using these definitions, we can now specialise the notions
of hypotheses, context, and closure to the sequential setting, as follows.


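Definition 4.31. The relation $\equiv^{H}_{KA}$ is generated from $H$ and $\equiv_{KA}$ as before.
A context $C \in \mathsf{PC}^{\mathsf{sp}}$ is sequential if it is totally ordered, i.e., if it is a word
with one occurrence of $\ast$; we write $\mathsf{PC}^{\mathsf{seq}}$ for the set of sequential contexts.
Given a set of hypotheses $H$ and a language $L \subseteq \Sigma^{*}$, we define the sequential
closure of $L$ with respect to $H$, written $L{\downarrow}^{H}_{\mathsf{seq}}$, as the least language containing $L$
such that for all $e \le f \in H$ and $C \in \mathsf{PC}^{\mathsf{seq}}$, if $C[\llbracket f \rrbracket] \subseteq L{\downarrow}^{H}_{\mathsf{seq}}$, then $C[\llbracket e \rrbracket] \subseteq L{\downarrow}^{H}_{\mathsf{seq}}$.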
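For finite languages and very simple hypotheses, the sequential closure can be computed as a fixpoint. The sketch below is a simplification, not the general construction: it assumes that every hypothesis $e \le f$ has both sides equal to single words over one-character letters (so that $\llbracket e \rrbracket$ and $\llbracket f \rrbracket$ are singletons), that $f$ is non-empty, and that $e$ is no longer than $f$, which guarantees termination. The function name is hypothetical.

```python
def sequential_closure(language, hypotheses):
    """Least set of words containing `language` such that, whenever x + f + y
    is in the set and (e, f) encodes a hypothesis e <= f, x + e + y is too.

    `language` is an iterable of strings; `hypotheses` is a list of pairs
    (e, f) of strings, with f non-empty and len(e) <= len(f) (assumption).
    """
    closure = set(language)
    worklist = list(closure)
    while worklist:
        word = worklist.pop()
        for e, f in hypotheses:
            start = word.find(f)
            while start != -1:
                # Replace this occurrence of f (the hole of a sequential
                # context) by e.
                new_word = word[:start] + e + word[start + len(f):]
                if new_word not in closure:
                    closure.add(new_word)
                    worklist.append(new_word)
                start = word.find(f, start + 1)
    return closure
```

For instance, with the single hypothesis `("a", "aa")`, which mimics a hypothesis of the shape $\alpha \le \alpha \cdot \alpha$, the closure of `{"aab"}` additionally contains `"ab"`.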


If || does not occur in any hypothesis, then the definition of sequential closure 
coincides with the closure operator from [7]. Thus, if $L \subseteq \Sigma^{*}$, then $L{\downarrow}^{H}_{\mathsf{seq}} \subseteq L{\downarrow}^{H}$.
The analogue of strong reduction for the sequential setting is as follows. 


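Definition 4.32. Suppose that $H$ implies $H'$. A map $r : T_{KA} \to T_{KA}$ is a sequential
reduction from $H$ to $H'$ when the following hold:


(i) for $e \in T_{KA}$, it holds that $e \equiv^{H}_{KA} r(e)$, and
(ii) for $e \in T_{KA}$, it holds that $\llbracket e \rrbracket{\downarrow}^{H}_{\mathsf{seq}} = \llbracket r(e) \rrbracket{\downarrow}^{H'}_{\mathsf{seq}}$.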


H sequentially reduces to H’ if there exists a sequential reduction from H to H’. 


To lift a sequential reduction to a proper reduction, the following class of 
hypotheses will turn out to be useful. 


Definition 4.33. A hypothesis $e \le f$ with $e, f \in T_{KA}$ is called grounded if
$\llbracket f \rrbracket = \{W\}$ for some non-empty word (totally ordered pomset) $W$, and $e \in T_{KA}$.
We say that a set of hypotheses $H$ is grounded if every $e \le f \in H$ is grounded.


Example 4.34. Any hypothesis of the form $e \le a_1 \cdots a_n$ for $n > 0$ is grounded.
On the other hand, the hypothesis $a \le 1$ that we saw in the previous example is
not grounded, since the semantics of 1 contains the empty pomset.


The closure of a language of words can be expressed in terms of its sequential 
closure, provided that the set of hypotheses is grounded. 


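Lemma 4.35. Let $H$ be grounded. If $L \subseteq \Sigma^{*}$, then $L{\downarrow}^{H} = L{\downarrow}^{H}_{\mathsf{seq}}$. Moreover, for
$L, L' \subseteq \mathsf{SP}$, we have that $(L \parallel L'){\downarrow}^{H} = L{\downarrow}^{H} \parallel L'{\downarrow}^{H}$.

The above then allows us to turn a sequential reduction into a reduction.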


Lemma 4.36. Suppose that H sequentially reduces to H’. If H and H' are 
grounded, then H strongly reduces to H’. 


5 Instantiation to CKA with Observations 


In this section, we will present Concurrent Kleene Algebra with Observations 
(CKAO), an extension of CKA with Boolean assertions that enable the specifica- 
tion of programs with the usual guarded conditionals and loops. We will obtain 
CKAO as an instance of CKAH by choosing a particular set of hypotheses. First, 
we define the set of propositional terms or Boolean observations. 


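Definition 5.1. Fix a finite set $\Omega$ of primitive observations. The set of propositional
terms, written $T_{BA}$, is generated by


$$p, q ::= \bot \mid \top \mid o \in \Omega \mid p \vee q \mid p \wedge q \mid \overline{p}$$


The relation $\equiv_{BA}$ is the smallest congruence on $T_{BA}$ s.t. for $p, q, r \in T_{BA}$, we have


$$\begin{array}{llll}
p \vee \bot \equiv_{BA} p & p \vee q \equiv_{BA} q \vee p & p \vee \overline{p} \equiv_{BA} \top & p \vee (q \vee r) \equiv_{BA} (p \vee q) \vee r \\
p \wedge \top \equiv_{BA} p & p \wedge q \equiv_{BA} q \wedge p & p \wedge \overline{p} \equiv_{BA} \bot & p \wedge (q \wedge r) \equiv_{BA} (p \wedge q) \wedge r \\
\multicolumn{2}{l}{p \vee (q \wedge r) \equiv_{BA} (p \vee q) \wedge (p \vee r)} & \multicolumn{2}{l}{p \wedge (q \vee r) \equiv_{BA} (p \wedge q) \vee (p \wedge r)}
\end{array}$$


We will write $p \leqq_{BA} q$ as a shorthand for $p \vee q \equiv_{BA} q$.

We write At for $2^{\Omega}$, the set of atoms of the Boolean algebra. It is well known
that every $\alpha \in At$ corresponds canonically to a Boolean term $\pi_\alpha$, such that every
Boolean term $p \in T_{BA}$ is equivalent to the disjunction of all $\pi_\alpha$ with $\pi_\alpha \leqq_{BA} p$ [2].
To simplify notation we identify $\alpha \in At$ with $\pi_\alpha$.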
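The correspondence between a Boolean term and the atoms below it can be made concrete by evaluation. The sketch below is an illustration under assumptions: an atom is represented as the set of primitive observations it makes true, terms use a hypothetical tuple encoding, and the observation names `o1` and `o2` are made up for the example.

```python
from itertools import combinations

OMEGA = ("o1", "o2")  # hypothetical primitive observations


def atoms(omega=OMEGA):
    """All atoms, each given as the frozenset of observations it satisfies."""
    return [frozenset(c)
            for k in range(len(omega) + 1)
            for c in combinations(omega, k)]


def holds(atom, p):
    """Evaluate a propositional term at an atom.

    Hypothetical term encoding: "top", "bot", an observation name,
    ("or", p, q), ("and", p, q) or ("not", p).
    """
    if p == "top":
        return True
    if p == "bot":
        return False
    if isinstance(p, str):
        return p in atom
    tag = p[0]
    if tag == "or":
        return holds(atom, p[1]) or holds(atom, p[2])
    if tag == "and":
        return holds(atom, p[1]) and holds(atom, p[2])
    if tag == "not":
        return not holds(atom, p[1])
    raise ValueError(f"not a propositional term: {p!r}")


def atoms_below(p, omega=OMEGA):
    """The atoms alpha with alpha below p, as a set of frozensets."""
    return {a for a in atoms(omega) if holds(a, p)}
```

Two terms are equivalent in Boolean algebra exactly when `atoms_below` returns the same set for both, which is the set-level counterpart of writing a term as the disjunction of the atoms below it.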

We can now use Tea in defining the terms and axioms of CKAO, which will 
be given as a CKA over a specific alphabet with the following hypotheses: 


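Definition 5.2 (CKAO). We define the terms of CKAO, denoted $T_{CKAO}$, as
$T(\Sigma \cup T_{BA})$, that is, as the CKA terms over $T_{BA} \cup \Sigma$. We furthermore define the
following set of hypotheses over $T_{CKAO}$:


$$\begin{aligned}
\mathsf{bool} &= \{p = q : p, q \in T_{BA} \text{ s.t. } p \equiv_{BA} q\} & \mathsf{contr} &= \{p \wedge q \le p \cdot q : p, q \in T_{BA}\} \\
\mathsf{glue} &= \{0 = \bot\} \cup \{p + q = p \vee q : p, q \in T_{BA}\} & \mathsf{obs} &= \mathsf{bool} \cup \mathsf{contr} \cup \mathsf{exch} \cup \mathsf{glue}
\end{aligned}$$

The semantics of CKAO is then given by $\llbracket - \rrbracket{\downarrow}^{\mathsf{obs}}$.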
The hypotheses bool contain the boolean identities, and glue identifies the
disjunction with the union (and their respective units as well). contr specifies that 
if p and q hold simultaneously, then it is possible to observe them in sequence. 
Note that the converse inequality is not included: observing p and q in sequence 
has strictly more behaviour than observing p and q simultaneously, as some 
intervening action can happen between the two observations. 

The above definition gives us the semantics of CKAO as the standard pomset 
language model obtained from taking the obs-closure of the semantics of CKA. 
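As a matter of fact, we find by Lemma 4.7 that if $e, f \in T_{CKAO}$ with $e \equiv^{\mathsf{obs}} f$ then
$\llbracket e \rrbracket{\downarrow}^{\mathsf{obs}} = \llbracket f \rrbracket{\downarrow}^{\mathsf{obs}}$; hence, we already have a sound model of CKAO.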

To prove completeness, we will use the techniques from the previous section. 


First step: reification. We start by using reification to rid ourselves of the 
hypotheses from bool and glue, and to simplify the hypotheses in contr. To this 
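end, let contr' be the set of hypotheses given by $\{\alpha \le \alpha \cdot \alpha : \alpha \in At\}$. Let
$\Gamma = At \cup \Sigma \subseteq T_{BA} \cup \Sigma$. We define $r : \Sigma \cup T_{BA} \to T(\Gamma)$ by setting


$$r(a) = \begin{cases} \sum_{\alpha \leqq_{BA} p} \alpha & a = p \in T_{BA} \\ a & a = a \in \Sigma \end{cases}$$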


Lemma 5.3. The hypotheses obs reduce to exch U contr’. 


Proof. By Lemma 4.23, it suffices to show that r is a reification, and that obs 
implies exch U contr’. To see that r is a reification, we check the conditions. 

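(i): If $a \in \Sigma$, then $r(a) = a \equiv^{\mathsf{obs}} a$ immediately. Otherwise, if $p \in T_{BA}$, then
we derive $r(p) = \sum_{\alpha \leqq_{BA} p} \alpha \equiv^{\mathsf{glue}} \bigvee_{\alpha \leqq_{BA} p} \alpha \equiv^{\mathsf{bool}} p$, and hence $r(p) \equiv^{\mathsf{obs}} p$.


(ii): If $a \in \Sigma$, then we already know that $r(a) = a$. Otherwise, if $\alpha \in At$, then
$r(\alpha) = \sum_{\beta \leqq_{BA} \alpha} \beta = \alpha$.


(iii): This property holds because all hypotheses in $\mathsf{exch} \cup \mathsf{contr}'$ preserve
$\Gamma$-languages, i.e., if $e \le f \in \mathsf{exch} \cup \mathsf{contr}'$ where $\llbracket f \rrbracket \subseteq \mathsf{SP}(\Gamma)$, then $\llbracket e \rrbracket \subseteq \mathsf{SP}(\Gamma)$
too. It follows that $\mathsf{exch} \cup \mathsf{contr}'$-closure must preserve $\Gamma$-languages.

(iv): We should show that if $e \le f \in \mathsf{obs}$, then $r(e) \leqq^{\mathsf{exch} \cup \mathsf{contr}'} r(f)$. To this
end, we analyse the separate sets of hypotheses that make up obs.


— Let $e \le f \in \mathsf{exch}$, then $e = (g_{00} \parallel g_{01}) \cdot (g_{10} \parallel g_{11})$ and $f = (g_{00} \cdot g_{10}) \parallel
(g_{01} \cdot g_{11})$, for some $g_{00}, g_{01}, g_{10}, g_{11} \in T$. We then find that


$$\begin{aligned}
r(e) &= (r(g_{00}) \parallel r(g_{01})) \cdot (r(g_{10}) \parallel r(g_{11})) \\
r(f) &= (r(g_{00}) \cdot r(g_{10})) \parallel (r(g_{01}) \cdot r(g_{11}))
\end{aligned}$$


hence $r(e) \le r(f) \in \mathsf{exch}$, and therefore $r(e) \leqq^{\mathsf{exch} \cup \mathsf{contr}'} r(f)$.
— Let $e \le f \in \mathsf{bool}$, then $e = p$ and $f = q$ such that $p \equiv_{BA} q$. In that case,


$$r(p) = \sum_{\alpha \leqq_{BA} p} \alpha = \sum_{\alpha \leqq_{BA} q} \alpha = r(q)$$
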
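— Let $e \le f \in \mathsf{contr}$; then $e = p \wedge q$ and $f = p \cdot q$ for $p, q \in T_{BA}$. Then


$$r(p \wedge q) = \sum_{\alpha \leqq_{BA} p \wedge q} \alpha \;\leqq^{\mathsf{contr}'} \sum_{\alpha \leqq_{BA} p \wedge q} \alpha \cdot \alpha \;\leqq\; \Big(\sum_{\alpha \leqq_{BA} p} \alpha\Big) \cdot \Big(\sum_{\alpha \leqq_{BA} q} \alpha\Big) = r(p) \cdot r(q) = r(p \cdot q)$$


— Let $e \le f \in \mathsf{glue}$. On the one hand, if $e = p \vee q$ and $f = p + q$, then


$$r(p \vee q) = \sum_{\alpha \leqq_{BA} p \vee q} \alpha \equiv \sum_{\alpha \leqq_{BA} p} \alpha + \sum_{\alpha \leqq_{BA} q} \alpha = r(p) + r(q) = r(p + q)$$


This also establishes the case for $f \le e \in \mathsf{glue}$. On the other hand, if $e = 0$
and $f = \bot$, then $r(0) = 0 = \sum_{\alpha \leqq_{BA} \bot} \alpha = r(\bot)$.

To see that obs implies $\mathsf{exch} \cup \mathsf{contr}'$, it suffices to show that obs implies contr'.
To this end, note that if $e \le f \in \mathsf{contr}'$, then $e = \alpha$ and $f = \alpha \cdot \alpha$ for some $\alpha \in At$.
We can then derive that $\alpha \equiv^{\mathsf{bool}} \alpha \wedge \alpha \leqq^{\mathsf{contr}} \alpha \cdot \alpha$, and hence $e \leqq^{\mathsf{obs}} f$.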
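As a small worked instance of the case for contr in the proof above, consider a single primitive observation. The names $\alpha_o$ and $\alpha_{\bar o}$ for the two atoms are introduced here for illustration only; the order of summands in $r(\top)$ is chosen arbitrarily.

```latex
% Assumption: \Omega = \{o\}, so At has two atoms, written \alpha_o (o holds)
% and \alpha_{\bar o} (o does not hold). Take p = \top and q = o.
\begin{align*}
r(\top \wedge o) &= \alpha_o
   && \text{(the only atom below $\top \wedge o$)} \\
 &\leqq^{\mathsf{contr}'} \alpha_o \cdot \alpha_o
   && \text{(hypothesis $\alpha_o \le \alpha_o \cdot \alpha_o$)} \\
 &\leqq (\alpha_o + \alpha_{\bar o}) \cdot \alpha_o
   && \text{(monotonicity of $\cdot$, since $\alpha_o \leqq \alpha_o + \alpha_{\bar o}$)} \\
 &= r(\top) \cdot r(o) = r(\top \cdot o)
\end{align*}
```

The extra summand $\alpha_{\bar o} \cdot \alpha_o$ on the right-hand side reflects the remark after Definition 5.2: observing $\top$ and then $o$ allows the state to change between the two observations.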


Second step: factorising. Since contr’ satisfies the precondition of Lemma 4.28, 
we obtain the following. 


Lemma 5.4. The hypotheses exch U contr’ factorise into exch and contr’. 


This means that, by Lemma 4.17, all that remains to do is strongly reduce
exch and contr' to $\emptyset$; we have already taken care of the former in Theorem 4.26.


Third step: reducing contr'. In [13], we have already shown that contr' sequentially
reduces to $\emptyset$. Since contr' is grounded we find the following, by Lemma 4.36.


Lemma 5.5. The hypotheses contr' strongly reduce to $\emptyset$.


Last step: putting it all together. Using the above reductions, we can then prove 
completeness of $\equiv^{\mathsf{obs}}$ w.r.t. $\llbracket - \rrbracket{\downarrow}^{\mathsf{obs}}$, and decidability of semantic equivalence, too.


Theorem 5.6 (Soundness and Completeness of CKAO). Let $e, f \in T_{CKAO}$.


(i) We have $e \equiv^{\mathsf{obs}} f$ if and only if $\llbracket e \rrbracket{\downarrow}^{\mathsf{obs}} = \llbracket f \rrbracket{\downarrow}^{\mathsf{obs}}$.
(ii) It is decidable whether $\llbracket e \rrbracket{\downarrow}^{\mathsf{obs}} = \llbracket f \rrbracket{\downarrow}^{\mathsf{obs}}$.


Proof. For the first claim, we already knew the implication from left to right
from Lemma 4.7. Conversely, and for the second claim, first note that obs
reduces to $\mathsf{exch} \cup \mathsf{contr}'$ by Lemma 5.3. By Lemma 5.4 and Lemma 4.17, the latter
reduces to $\emptyset$, if we apply Theorem 4.26 and Lemma 5.5. By Lemma 4.12, we then
conclude that obs is complete and decidable, hence establishing the claim.
6 Discussion


The first contribution of this paper is to extend Kleene algebra with hypotheses [7] 
with a parallel operator. The resulting framework, concurrent Kleene algebra with 
hypotheses (CKAH), is interpreted over pomset languages, a standard model of 
concurrency. We start from simple axioms, known to capture equality of pomset 
languages [23]. CKAH allows custom axioms, the so-called hypotheses, to be added.
These may be used to include domain-specific information in the language. We 
develop this framework by providing a systematic way of producing from the 
hypotheses a sound pomset language model. We also propose techniques that 
may be used to prove completeness and decidability of the resulting model. 

An important instance of this framework is concurrent Kleene algebra (CKA) 
as presented in [11]. The only additional axiom there, known as the exchange 
law, may be added as a set of hypotheses. We prove that the resulting semantics 
coincides with the (subsumption-closed) semantics of CKA and, more interestingly, 
the completeness proof of [15] can be recovered as an instance of this framework. 

The second contribution is a new framework to reason about programs with 
concurrency: concurrent Kleene algebra with observations (CKAO). CKAO is 
obtained as an instance of CKAH, where we add the exchange law to model 
concurrent behaviour, and Boolean assertions to model control flow. The Boolean 
assertions we consider are as in Kleene algebra with observations (KAO) [13] — in 
fact, CKAO is a conservative extension of KAO. Using the techniques developed 
earlier, we obtain a sound and complete semantics for this algebra. While CKAO 
is similar to concurrent Kleene algebra with tests [12], it avoids the problems 
of the latter by distinguishing conjunction and sequential composition. CKAO 
provides the first sound and complete algebraic theory that seems sensible as a 
framework to reason about concurrent programs with Boolean assertions. 

Future work is to explore other meaningful instances of CKAH. Synchronous
Kleene algebra [29,26] is a natural candidate for this. We also want to try and design
domain-specific languages, specifically, a concurrent variant of NetKAT [1,8].

The class of hypotheses considered in this paper for which decidability and 
completeness may be established systematically is somewhat restrictive; identify- 
ing larger classes of tractable hypotheses is a challenging open problem. 

Because of the compositional nature of our model, the CKAO semantics of a 
program contains behaviours that are not possible to obtain in isolation. These 
behaviours are present to allow the program to interact meaningfully with its 
environment, i.e., when placed in a context. However, for practical purposes one 
might want to close the system, and only consider behaviours that are possible 
in isolation. Studying this semantics remains a subject of future work.

In the semantics of concurrent programs with assertions, it would be natural 
to see atoms as partial instead of total functions. This captures the intuition 
that a thread might not have access to the complete machine state, but instead 
holds a partial view of it. Pseudo-complemented distributive lattices (PCDL) 
have been proposed [12] as an alternative to Boolean algebra, modelling this 
partiality of information. We leave it to future work to investigate the variant of 
CKAO obtained by replacing the Boolean algebra of observations with a PCDL. 
Abstract. We provide graded extensions of algebraic theories and Lawvere
theories that correspond to graded monads. We prove that graded
algebraic theories, graded Lawvere theories, and finitary graded monads
are equivalent via equivalence of categories, which extends the equivalence
for monads. We also give sums and tensor products of graded
algebraic theories to combine computational effects as an example of
importing techniques based on algebraic theories to graded monads.


1 Introduction 


In the field of denotational semantics of programming languages, monads have 
been used to express computational effects since Moggi’s seminal work [18]. They 
have many applications from both theoretical and practical points of view. 

Monads correspond to algebraic theories [5]. This correspondence gives natural
presentations of many kinds of computational effects by operations and
equations [21], which is the basis of algebraic effect [20]. The algebraic perspective
of monads also provides ways of combining [9], reasoning about [22], and
handling computational effects [23].

Graded monads [27] are a refinement of monads and defined as a monad-like
structure indexed by a monoidal category (or a preordered monoid). The
unit and multiplication of graded monads are required to respect the monoidal
structure. This structure enables graded monads to express some kind of “abstraction”
of effectful computations. For example, graded monads are used to
give denotational semantics of effect systems [12], which are type systems designed
to estimate scopes of computational effects caused by programs.

This paper provides a graded extension of algebraic theories that corresponds to
monads graded by small strict monoidal categories. This generalizes N-graded
theories in [17]. The main ideas of this extension are the following. First,
we assign to each operation a grade, i.e., an object in a monoidal category that
represents effects. Second, our extension provides a mechanism (Fig. 1) to keep
track of effects in the same way as graded monads. That is, if an operation $f$
with grade $m$ is applied to terms with grade $m'$, then the grade of the whole
term is the product $m \otimes m'$.


$$\frac{f \in \Sigma_{n,m} \qquad t_i \in T_{m'}(X) \text{ for each } i \in \{1, \ldots, n\}}{f(t_1, \ldots, t_n) \in T_{m \otimes m'}(X)}$$

Fig. 1. A rule of term formation.
For example, graded algebraic theories enable us to estimate (an overapproximation
of) the set of memory locations computations may access. The side-effects
theory [21] is given by operations $\mathrm{lookup}_l$ and $\mathrm{update}_{l,v}$ for each location
$l \in L$ and value $v \in V$ together with several equations, and each term represents
a computation with side-effects. Since $\mathrm{lookup}_l$ and $\mathrm{update}_{l,v}$ only read from or
write to the location $l$, we assign $\{l\} \in 2^{L}$ as the grade of the operations in
the graded version of the side-effects theory where $2^{L}$ is the join-semilattice of
subsets of locations $L$. The grade of a term is (an overapproximation of) the set
of memory locations the computations may access thanks to the rule in Fig. 1.
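The grading of the side-effects theory described above can be sketched as a recursive function on terms. This is an illustration under assumptions: the tuple encoding of terms is hypothetical, `lookup` is taken to have one continuation per value and `update` a single continuation, and the grades of different branches are joined by union, which implicitly coerces each branch along an inclusion of location sets.

```python
def grade(term):
    """Over-approximate the set of locations a term may access.

    Hypothetical term encoding:
      ("var", x)                          a variable, with the unit grade (empty set)
      ("lookup", l, {v: continuation})    read location l, continue per value
      ("update", l, v, continuation)      write v to location l, then continue
    """
    tag = term[0]
    if tag == "var":
        return frozenset()
    if tag == "lookup":
        _, location, branches = term
        # Join the grades of all branches (coercion along inclusions).
        below = frozenset().union(*(grade(t) for t in branches.values()))
        return frozenset({location}) | below
    if tag == "update":
        _, location, _value, continuation = term
        return frozenset({location}) | grade(continuation)
    raise ValueError(f"not a term: {term!r}")
```

For a term that reads a location `l1` and, in one branch only, writes to `l2`, the computed grade is the set containing both locations, even though some executions never touch `l2`; this is the sense in which the grade is an overapproximation.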

We also provide graded Lawvere theories that correspond to graded algebraic 
theories. The intuition of a Lawvere theory is a category whose arrows are terms 
of an algebraic theory. We use this intuition to define graded Lawvere theories. 
In graded algebraic theories, each term has a grade, and substitution of terms 
must respect the monoidal structure of grades. To characterize this structure of 
“graded” terms, we consider Lawvere theories enriched in a presheaf category.

Just as algebraic theories brought many concepts and techniques to the semantics
of computational effects, we expect that the proposed graded algebraic
theories will do the same for effect systems. We look into one example out of
such possibilities: combining graded algebraic theories.

The main contributions of this paper are summarized as follows. 


— We generalize (N-)graded algebraic theories of [17] to M-graded algebraic 
theories and also provide M-graded Lawvere theories where M is a small 
strict monoidal category. We show that there exist translations between these 
notions and finitary graded monads, which yield equivalences of categories. 

— We extend sums and tensor products of algebraic theories [9] to graded 
algebraic theories. We define sums in the category of M-graded algebraic 
theories, and tensor products as an M x M’-graded algebraic theory made 
from an M-graded and an M’-graded algebraic theory. We also show a few 
properties and examples of these constructions. 


2 Preliminaries 


2.1 Enriched Category Theory 


We review enriched category theory and introduce notations. See [13] for details. 
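Let $\mathcal{V}_0 = (\mathcal{V}_0, \otimes, I)$ be a (not necessarily symmetric) monoidal category.
$\mathcal{V}_0$ is right closed if $(-) \otimes X : \mathcal{V}_0 \to \mathcal{V}_0$ has a right adjoint $[X, -]$ for each
$X \in \mathrm{ob}\,\mathcal{V}_0$. Similarly, $\mathcal{V}_0$ is left closed if $X \otimes (-)$ has a right adjoint $\llbracket X, - \rrbracket$ for
each $X \in \mathrm{ob}\,\mathcal{V}_0$. $\mathcal{V}_0$ is biclosed if $\mathcal{V}_0$ is left and right closed.
Let $\mathcal{V}_0^{t}$ denote the monoidal category $(\mathcal{V}_0, \otimes^{t}, I)$ where $\otimes^{t}$ is defined by
$X \otimes^{t} Y := Y \otimes X$. Note that $\mathcal{V}_0^{t}$ is right closed if and only if $\mathcal{V}_0$ is left closed.
We define $\mathcal{V}_0$-category, $\mathcal{V}_0$-functor and $\mathcal{V}_0$-natural transformation as in [13].
If $\mathcal{V}_0$ is right closed, then $\mathcal{V}_0$ itself enriches to a $\mathcal{V}_0$-category $\mathcal{V}$ with hom-object
given by $\mathcal{V}(X, Y) := [X, Y]$. We use the subscript $(-)_0$ to distinguish the
enriched category $\mathcal{V}$ from its underlying category $\mathcal{V}_0$.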
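Assume that $\mathcal{V}_0$ is biclosed and let $\mathcal{A}$ be a $\mathcal{V}_0$-category. The opposite category
$\mathcal{A}^{op}$ is the $\mathcal{V}_0^{t}$-category defined by $\mathcal{A}^{op}(X, Y) = \mathcal{A}(Y, X)$. For any $X \in
\mathrm{ob}\,\mathcal{A}$, $\mathcal{A}(X, -) : \mathcal{A} \to \mathcal{V}$ is a $\mathcal{V}_0$-functor where $\mathcal{A}(X, -)_{Y,Z} : \mathcal{A}(Y, Z) \to
[\mathcal{A}(X, Y), \mathcal{A}(X, Z)]$ is defined by transposing the composition law $\circ$ of $\mathcal{A}$. A
$\mathcal{V}_0^{t}$-functor $\mathcal{A}(-, X)$ is defined by $\mathcal{A}^{op}(X, -) : \mathcal{A}^{op} \to \mathcal{V}^{t}$.

Let $\mathcal{A}$ be a $\mathcal{V}_0$-category. For each $X \in \mathcal{V}_0$ and $C \in \mathcal{A}$, a tensor $X \otimes C$ is
an object in $\mathcal{A}$ together with a counit morphism $\nu : X \to \mathcal{A}(C, X \otimes C)$ such
that a $\mathcal{V}_0$-natural transformation $\mathcal{A}(X \otimes C, -) \to [X, \mathcal{A}(C, -)]$ obtained by
transposing $(\circ) \circ (\mathcal{A}(X \otimes C, B) \otimes \nu)$ is an isomorphism, where $\circ$ is the composition
in the $\mathcal{V}_0$-category $\mathcal{A}$. A cotensor $X \pitchfork C$ is a tensor in $\mathcal{A}^{op}$. For example, if
$\mathcal{V}_0 = \mathbf{Set}$, then tensors $X \otimes C$ are copowers $X \cdot C$, and cotensors $X \pitchfork C$ are
powers $C^{X}$.

A $\mathcal{V}_0$-functor $F : \mathcal{A} \to \mathcal{B}$ is said to preserve a tensor $X \otimes C$ if $F_{C, X \otimes C} \circ \nu :
X \to \mathcal{B}(FC, F(X \otimes C))$ is again a counit morphism. $F$ preserves cotensors if
$F^{op}$ preserves tensors.

Let $\Phi$ be a collection of objects in $\mathcal{V}_0$. A $\mathcal{V}_0$-functor $F : \mathcal{A} \to \mathcal{B}$ is said to
preserve $\Phi$-(co)tensors if $F$ preserves (co)tensors of the form $X \otimes C$ ($X \pitchfork C$)
for each $X \in \Phi$ and $C \in \mathrm{ob}\,\mathcal{A}$.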


2.2 Graded Monads 


We review the notion of graded monad in [7,12], and then define the category
$\mathbf{GMnd}_M$ of finitary $M$-graded monads. Throughout this section, we fix a small
strict monoidal category $M = (M, \otimes, I)$.


Definition 1 (graded monads). An $M$-graded monad on $C$ is a lax monoidal
functor $M \to [C, C]$ where $[C, C]$ is a monoidal category with composition as
multiplication. That is, an $M$-graded monad is a tuple $(*, \eta, \mu)$ of a functor
$* : M \times C \to C$ and natural transformations $\eta_X : X \to I * X$ and $\mu_{m_1, m_2, X} :
m_1 * (m_2 * X) \to (m_1 \otimes m_2) * X$ such that the following diagrams commute.


[Diagrams: the unit laws and the associativity law, which state the following equations.]

$$\mu_{I, m, X} \circ \eta_{m * X} = \mathrm{id}_{m * X} = \mu_{m, I, X} \circ (m * \eta_X)$$
$$\mu_{m_1 \otimes m_2, m_3, X} \circ \mu_{m_1, m_2, m_3 * X} = \mu_{m_1, m_2 \otimes m_3, X} \circ (m_1 * \mu_{m_2, m_3, X})$$


A morphism of $M$-graded monads is a monoidal natural transformation $\alpha :
(*, \eta, \mu) \to (*', \eta', \mu')$, i.e. a natural transformation $\alpha : * \to *'$ that is compatible
with $\eta$ and $\mu$.


An intuition of graded monads is a refinement of monads: $m * X$ is a computation
whose scope of effect is indicated by $m$ and whose result is in $X$. The
monoidal category $M$ defines the granularity of the refinement, and a 1-graded
monad is just an ordinary monad. Note that we do not assume that $M$ is symmetric
because some of the graded monads in [12] require $M$ to be nonsymmetric.
We also deal with such a nonsymmetric case in Example 25.

A finitary functor is a functor that preserves filtered colimits. In this paper, 
we focus on finitary graded monads on Set. 
Definition 2. A finitary $M$-graded monad on $\mathbf{Set}$ is a lax monoidal functor
$M \to [\mathbf{Set}, \mathbf{Set}]_f$ where $[\mathbf{Set}, \mathbf{Set}]_f$ denotes the full subcategory of $[\mathbf{Set}, \mathbf{Set}]$
on finitary functors. Let $\mathbf{GMnd}_M$ denote the category of finitary $M$-graded
monads and monoidal natural transformations between them.

A morphism in $\mathbf{GMnd}_M$ is determined by the restriction to $\aleph_0 \subseteq \mathbf{Set}$ where
$\aleph_0$ is the full subcategory of $\mathbf{Set}$ on natural numbers.

Lemma 3. Let $T = (*, \eta, \mu)$ and $T' = (*', \eta', \mu')$ be finitary $M$-graded monads.
There exists a one-to-one correspondence between the following.
1. Morphisms $\alpha : T \to T'$.
2. Natural transformations $\beta : * \circ (M \times i) \to *' \circ (M \times i)$ (where $i : \aleph_0 \to \mathbf{Set}$
is the inclusion functor) such that the following diagrams commute for each
$n, n' \in \aleph_0$, $m_1, m_2 \in M$ and $f : n \to m_2 * n'$.


[Diagrams: compatibility of $\beta$ with the units and with the multiplications, which state the following equations.]

$$\beta_{I, n} \circ \eta_n = \eta'_n$$
$$\beta_{m_1 \otimes m_2, n'} \circ \mu_{m_1, m_2, n'} \circ (m_1 * f) = \mu'_{m_1, m_2, n'} \circ (m_1 *' \beta_{m_2, n'}) \circ (m_1 *' f) \circ \beta_{m_1, n}$$


Proof. By the equivalence $[\mathbf{Set}, \mathbf{Set}]_f \simeq [\aleph_0, \mathbf{Set}]$ induced by restriction and the
left Kan extension along the inclusion $i : \aleph_0 \to \mathbf{Set}$.


2.3 Day Convolution 


We describe a monoidal biclosed structure on the (covariant) presheaf category 
$[M, \mathbf{Set}]$ where $M = (M, \otimes, I)$ is a small monoidal category [3]. Here, we use the
subscript $(-)_0$ to indicate that $[M, \mathbf{Set}]_0$ is an ordinary (not enriched) category
since we also use the enriched version $[M, \mathbf{Set}]$ later.

The external tensor product $F \boxtimes G : M \times M \to \mathbf{Set}$ is defined by $(F \boxtimes
G)(m_1, m_2) = F m_1 \times G m_2$ for any $F, G : M \to \mathbf{Set}$.

Definition 4. Let $F, G : M \to \mathbf{Set}$ be functors. The Day tensor product $F \mathbin{\tilde{\otimes}}
G : M \to \mathbf{Set}$ is the left Kan extension $\mathrm{Lan}_{\otimes}(F \boxtimes G)$ of the external tensor
product $F \boxtimes G : M \times M \to \mathbf{Set}$ along the tensor product $\otimes : M \times M \to M$.


Note that a natural transformation $\theta : F \mathbin{\tilde{\otimes}} G \to H$ is equivalent to a natural
transformation $\theta_{m_1, m_2} : F m_1 \times G m_2 \to H(m_1 \otimes m_2)$ by the universal property.
The Day convolution induces a monoidal biclosed structure in $[M, \mathbf{Set}]_0$ [3].
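In the special case where the grading category is a discrete monoid, the left Kan extension in Definition 4 reduces to a disjoint union over all decompositions of a grade. The sketch below illustrates this for the discrete monoid of natural numbers under addition and for finite sets; this choice of monoid, the dictionary encoding of functors, and the tagging of elements by their decomposition are all assumptions made for the example.

```python
from itertools import product


def day_tensor(F, G):
    """Day tensor product of two families of finite sets indexed by the
    discrete monoid (N, +, 0).

    F and G map a grade (int) to a finite set. The result maps n to the
    disjoint union, over all n1 + n2 == n, of F[n1] x G[n2]; elements are
    tagged with (n1, n2) to keep the union disjoint.
    """
    result = {}
    for (n1, xs), (n2, ys) in product(F.items(), G.items()):
        bucket = result.setdefault(n1 + n2, set())
        bucket.update((n1, x, n2, y) for x in xs for y in ys)
    return result


def unit():
    """The monoidal unit y(I): a singleton at grade 0, empty elsewhere."""
    return {0: {"*"}}
```

With this encoding, a map out of `day_tensor(F, G)` at grade `n1 + n2` is the same thing as a family of maps out of the products `F[n1] x G[n2]`, which is the universal property noted above, specialised to a discrete grading monoid.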


Proposition 5. The Day tensor product makes $([M, \mathbf{Set}]_0, \tilde{\otimes}, y(I))$ a monoidal
biclosed category where $y : M^{op} \to [M, \mathbf{Set}]_0$ is the Yoneda embedding $y(m) =
M(m, -)$.

The left and the right closed structure are given by $\llbracket F, G \rrbracket m = [M, \mathbf{Set}]_0(F,
G(m \otimes -))$ and $[F, G] m = [M, \mathbf{Set}]_0(F, G(- \otimes m))$ for each $m \in M$, respectively.

Note that since we do not assume $M$ to be symmetric, neither is $[M, \mathbf{Set}]_0$.
Note also that the twisting and the above construction commute: there is an
isomorphism $[M, \mathbf{Set}]_0^{t} \cong [M^{t}, \mathbf{Set}]_0$ of monoidal categories.
2.4 Categories Enriched in a Presheaf Category


We rephrase the definitions of $[M, \mathbf{Set}]_0$-enriched category, functor and natural
transformation in elementary terms. An $[M, \mathbf{Set}]_0$-category is, so to say, an “$M$-graded”
category: each morphism has a grade $m \in \mathrm{ob}\,M$ and the grade of the
composite of two morphisms with grades $m$ and $m'$ is the product $m \otimes m'$ of
the grades of each morphism. Likewise, $[M, \mathbf{Set}]_0$-functors and $[M, \mathbf{Set}]_0$-natural
transformations can be also understood as an “$M$-graded” version of ordinary
functors and natural transformations. Specifically, the following lemma holds [2].


Lemma 6. There is a one-to-one correspondence between (1) an $[M, \mathbf{Set}]_0$-category
$C$ and (2) the following data satisfying the following conditions.


— A class of objects $\mathrm{ob}\,C$.

— For each $X, Y \in \mathrm{ob}\,C$, a hom object $C(X, Y) \in [M, \mathbf{Set}]_0$.

— For each $X \in \mathrm{ob}\,C$, an element $1_X \in C(X, X)I$.

— For each $X, Y, Z \in \mathrm{ob}\,C$, a family of morphisms $(\circ_{m_1, m_2} : C(Y, Z)m_1 \times
C(X, Y)m_2 \to C(X, Z)(m_1 \otimes m_2))_{m_1, m_2 \in M}$ which is natural in $m_1$ and
$m_2$. The subscripts $m_1$ and $m_2$ are often omitted.


These data must satisfy the identity law $1_Y \circ f = f = f \circ 1_X$ for each
$f \in C(X, Y)m$ and the associativity $(h \circ g) \circ f = h \circ (g \circ f)$ for each
$f \in C(X, Y)m_1$, $g \in C(Y, Z)m_2$ and $h \in C(Z, W)m_3$.


Proof. The identity $1_X : y(I) \to C(X, X)$ in $C$ corresponds to $1_X \in C(X, X)I$
by the Yoneda lemma, and the composition $\circ : C(Y, Z) \mathbin{\tilde{\otimes}} C(X, Y) \to C(X, Z)$
in $C$ corresponds to the natural transformation $\circ_{m_1, m_2} : C(Y, Z)m_1 \times C(X, Y)m_2
\to C(X, Z)(m_1 \otimes m_2)$ by the universal property of the Day convolution. The
rest of the proof is easy.


An $[M, \mathbf{Set}]_0$-functor $F : C \to D$ consists of a mapping $X \mapsto FX$ and
a natural transformation $F_{X,Y} : C(X, Y) \to D(FX, FY)$ (for each $X, Y$) that
preserves identities and compositions of morphisms. An $[M, \mathbf{Set}]_0$-natural transformation
$\alpha : F \to G$ is a family of elements $(\alpha_X \in D(FX, GX)I)_{X \in \mathrm{ob}(C)}$ that
satisfies $\alpha_Y \circ Ff = Gf \circ \alpha_X$ for each $f \in C(X, Y)m$. Vertical and horizontal
compositions of $[M, \mathbf{Set}]_0$-natural transformations are defined as expected.

We introduce a useful construction of $[M, \mathbf{Set}]_0^{t}$-categories. Given an $M$-graded
monad (in other words, a lax left $M$-action) on $C$, we can define an
$[M, \mathbf{Set}]_0^{t}$-enriched category as follows.


Definition 7. Let $T = (*, \eta, \mu)$ be an $M$-graded monad on $C$. An $[M, \mathbf{Set}]_0^{t}$-category
$C_T$ is defined by $\mathrm{ob}\,C_T := \mathrm{ob}\,C$ and $C_T(X, Y)m := C(X, m * Y)$. The
identity morphisms are the unit morphisms $\eta_X \in C_T(X, X)I$, and the composite
of $f \in C_T(Y, Z)m$ and $g \in C_T(X, Y)m'$ is $\mu \circ (m' * f) \circ g$, which has grade $m' \otimes m$.
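The composition in Definition 7 can be illustrated with a simple graded monad on sets. The sketch below assumes a grading by positive integers under multiplication, where a computation of grade $m$ returns a list of at most $m$ results; this particular graded monad and the names `Arrow`, `unit` and `compose` are chosen here for illustration and are not taken from the text. Since this grading monoid is commutative, the example does not show the role of the twisting.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass(frozen=True)
class Arrow:
    """A graded Kleisli-style arrow: a function from X to lists over Y
    of length at most `grade` (assumed graded list monad)."""
    grade: int
    run: Callable[[object], List]


def unit() -> Arrow:
    """Identity arrow: the unit of the graded monad, with unit grade 1."""
    return Arrow(1, lambda x: [x])


def compose(f: Arrow, g: Arrow) -> Arrow:
    """Composite 'f after g': run g, apply f to every result (the functor
    action at g's grade), then flatten (the multiplication)."""
    return Arrow(g.grade * f.grade,
                 lambda x: [z for y in g.run(x) for z in f.run(y)])
```

Composing an arrow of grade 2 with one of grade 3 yields an arrow of grade 6, and the flattened result list never exceeds that bound; composing with `unit()` on either side leaves the results unchanged.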


The definition of $C_T$ is similar to the definition of the Kleisli categories for
ordinary monads. Actually, $C_T$ can be constructed via the Kleisli category $\mathbf{C}_T$
for the graded monad $T$ presented in [7] (although $\mathbf{C}_T$ itself is not enriched).
This can be observed by $\mathbf{C}_T((I, X), (m, Y)) \cong C_T(X, Y)m$.
3 Graded Algebraic Theories


We explain a framework of universal algebra for graded monads, which is a 
natural extension of [17,27]. The key idea of this framework is that each term 
is associated with not only an arity but also a “grade”, which is represented by 
an object in a monoidal category M. We also add a coercion construct for terms that changes the grade of terms along a morphism of the monoidal category M. Then, a mapping that takes $m \in M$ and a set of variables X and returns the set of terms with grade m (modulo the equational axioms) yields a graded monad.

We fix a small strict monoidal category $M = (M, \otimes, I)$ throughout this section. We sometimes identify $n \in \mathbb{N}$ with $\{1, \dots, n\}$, or $\{x_1, \dots, x_n\}$ if it is used as a set of variables.


3.1 Equational Logic 


A signature is a family of sets of symbols $\Sigma = (\Sigma_{n,m})_{n \in \mathbb{N}, m \in M}$. An element $f \in \Sigma_{n,m}$ is called an operation with arity n and grade m. We define a sufficient structure to interpret operations in a category C as follows.


Definition 8. The M-model condition is defined by the following conditions on a tuple $(C, (\circledast, \eta^{\circledast}, \mu^{\circledast}))$.


— C is a category with finite powers.

— $(\circledast, \eta^{\circledast}, \mu^{\circledast})$ is a strong $M^t$-action (i.e. an $M^t$-graded monad whose unit and multiplication are invertible).

— For each $m \in M$, $m \circledast (-)$ preserves finite powers: $m \circledast c^n \cong (m \circledast c)^n$.


Example 9. If A is a category with finite powers, then the functor category [M, A] has a strong $M^t$-action defined by $m \circledast F := F(m \otimes (-))$ and satisfies the M-model condition. In particular, [M, Set] satisfies the M-model condition.


A model $A = (A, |\cdot|^A)$ of $\Sigma$ in a category C satisfying the M-model condition consists of an object $A \in C$ and an interpretation $|f|^A : A^n \to m \circledast A$ for each $f \in \Sigma_{n,m}$. A homomorphism $\alpha : A \to B$ between two models A, B is a morphism $\alpha : A \to B$ in C such that $(m \circledast \alpha) \circ |f|^A = |f|^B \circ \alpha^n$ for each $f \in \Sigma_{n,m}$.


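Definition 10. Let X be a set of variables. The set of (M-graded) $\Sigma$-terms $T^{\Sigma}_m(X)$ for each $m \in M$ is defined inductively as follows.

$$\frac{x \in X}{x \in T^{\Sigma}_I(X)} \qquad \frac{t \in T^{\Sigma}_m(X) \quad w : m \to m'}{c_w(t) \in T^{\Sigma}_{m'}(X)} \qquad \frac{f \in \Sigma_{n,m} \quad \forall i \in \{1, \dots, n\},\ t_i \in T^{\Sigma}_{m'}(X)}{f(t_1, \dots, t_n) \in T^{\Sigma}_{m \otimes m'}(X)}$$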


That is, we build $\Sigma$-terms from variables by applying operations in $\Sigma$ and coercions $c_w$ while keeping track of the grade of terms. When applying operations, we sometimes write $f(\lambda i \in n. t_i)$ or $f(\lambda i. t_i)$ instead of $f(t_1, \dots, t_n)$.
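
The grade bookkeeping for terms can be sketched in Python. This is only an illustration under stated assumptions: the grading category is taken to be the preordered monoid (N, +, 0, ≤), so a coercion is determined by its source and target grades, and the names `Var`, `Coerce`, `Op`, `tensor`, `has_morphism` and `grade` are hypothetical. Since a nullary operation has no arguments from which the common argument grade m' could be read off, the sketch stores that grade explicitly on every operation node.

```python
from dataclasses import dataclass
from typing import Tuple, Union

# Assumed grading category for this sketch: (N, +, 0) ordered by <=.
UNIT = 0

def tensor(m: int, n: int) -> int:
    return m + n

def has_morphism(src: int, dst: int) -> bool:
    return src <= dst

@dataclass(frozen=True)
class Var:
    name: str

@dataclass(frozen=True)
class Coerce:
    src: int          # grade of the body
    dst: int          # grade after coercion along src <= dst
    body: "Term"

@dataclass(frozen=True)
class Op:
    symbol: str
    op_grade: int     # grade m of the operation symbol
    arg_grade: int    # common grade m' of all arguments
    args: Tuple["Term", ...]

Term = Union[Var, Coerce, Op]

def grade(t: Term) -> int:
    """Return the grade of a well-formed term, or raise ValueError."""
    if isinstance(t, Var):
        return UNIT                      # variables have the unit grade
    if isinstance(t, Coerce):
        if grade(t.body) != t.src or not has_morphism(t.src, t.dst):
            raise ValueError("ill-graded coercion")
        return t.dst
    if any(grade(a) != t.arg_grade for a in t.args):
        raise ValueError("arguments must share the declared grade")
    return tensor(t.op_grade, t.arg_grade)   # grade m (x) m'
```

For instance, with a unary operation `tick` of grade 1, the term `Op("tick", 1, 0, (Var("x"),))` has grade 1, and applying `tick` to that term directly is rejected until the declared argument grade matches.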


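Definition 11. Let A be a model of a signature $\Sigma$. For each $m \in M$ and $s \in T^{\Sigma}_m(n)$, the interpretation $|s|^A : A^n \to m \circledast A$ is defined as follows.

— For any variable $x_i$, $|x_i|^A = \eta^{\circledast}_A \circ \pi_i$ where $\pi_i : A^n \to A$ is the i-th projection.
— For each $w : m' \to m$ and $s \in T^{\Sigma}_{m'}(\{x_1, \dots, x_n\})$, $|c_w(s)|^A = (w \circledast A) \circ |s|^A$.
— If $f \in \Sigma_{k,m'}$ and $t_i \in T^{\Sigma}_{m''}(\{x_1, \dots, x_n\})$ for each $i \in \{1, \dots, k\}$, then $|f(t_1, \dots, t_k)|^A$ is defined by the following composite.

$$A^n \xrightarrow{\langle |t_1|, \dots, |t_k| \rangle} (m'' \circledast A)^k \xrightarrow{\cong} m'' \circledast A^k \xrightarrow{m'' \circledast |f|} m'' \circledast (m' \circledast A) \xrightarrow{\mu^{\circledast}} (m' \otimes m'') \circledast A$$
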
When we interpret a term $t \in T^{\Sigma}_m(X)$, we need to pick a finite set n such that $\mathrm{fv}(t) \subseteq n \subseteq X$ where $\mathrm{fv}(t)$ is the set of free variables in t, but the choice of the finite set does not matter when we consider only equality of interpretations by the following fact. If $\sigma : n \to n'$ is a renaming of variables and $\sigma : T^{\Sigma}_m(n) \to T^{\Sigma}_m(n')$ is the mapping induced by the renaming (denoted by the same symbol), then for each $t \in T^{\Sigma}_m(n)$, $|\sigma(t)|^A = |t|^A \circ A^{\sigma}$, which implies that equality of the interpretations of two terms s, t is preserved by renaming: $|s| = |t|$ implies $|\sigma(s)| = |\sigma(t)|$.

An equational axiom is a family of sets $E = (E_m)_{m \in M}$ where $E_m$ is a set of pairs of terms in $T^{\Sigma}_m(X)$. We sometimes identify E with its union $\bigcup_{m \in M} E_m$. A presentation of an M-graded algebraic theory (or an M-graded algebraic theory) is a pair $\mathcal{T} = (\Sigma, E)$ of a signature and an equational axiom. A model A of $(\Sigma, E)$ is a model of $\Sigma$ that satisfies $|s|^A = |t|^A$ for each $(s = t) \in E$. Let $\mathrm{Mod}_{\mathcal{T}}(C)$ denote the category of models of $\mathcal{T}$ in C and homomorphisms between them.

To obtain a graded monad on Set from $\mathcal{T}$, we need a strict left action of M on $\mathrm{Mod}_{\mathcal{T}}([M,\mathrm{Set}]_0)$ and an adjunction between $\mathrm{Mod}_{\mathcal{T}}([M,\mathrm{Set}]_0)$ and Set. The former is defined by the following, while the latter is described in §3.2.


Lemma 12. Let C be a category satisfying the $M_1 \times M_2$-model condition. If $\mathcal{T}$ is an $M_1$-graded algebraic theory, then C satisfies the $M_1$-model condition and $\mathrm{Mod}_{\mathcal{T}}(C)$ satisfies the $M_2$-model condition.


Proof. An $M_1^t$-action on C is obtained by the composition of the $M_1^t \times M_2^t$-action and the strong monoidal functor $M_1^t \to M_1^t \times M_2^t$ defined by $m \mapsto (m, I)$. Finite powers and an $M_2^t$-action for $\mathrm{Mod}_{\mathcal{T}}(C)$ are induced by those for C.


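Corollary 13. $\mathrm{Mod}_{\mathcal{T}}([M,\mathrm{Set}]_0)$ has an M-action, which is given by the precomposition of $(-) \otimes m$, in the same way as the $M^t$-action of Example 9 is given by the precomposition of $m \otimes (-)$.


Proof. $[M,\mathrm{Set}]_0$ has an $M^t \times M$-action defined by $(m_1, m_2) \circledast F = F(m_1 \otimes (-) \otimes m_2)$. Thus, an M-action for $\mathrm{Mod}_{\mathcal{T}}([M,\mathrm{Set}]_0)$ is obtained by Lemma 12.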


Substitution $s[t_1/x_1, \dots, t_k/x_k]$ for M-graded $\Sigma$-terms can be defined as usual, but we have to take care of grades: given $s \in T^{\Sigma}_m(k)$ and $t_1, \dots, t_k \in T^{\Sigma}_{m'}(n)$, the substitution $s[t_1/x_1, \dots, t_k/x_k]$ is defined as a term in $T^{\Sigma}_{m \otimes m'}(n)$.


We obtain an equational logic for graded theories by adding some additional 
rules to the usual equational logic. 


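Definition 14. The entailment relation $\mathcal{T} \vdash s = t$ (where $s, t \in T^{\Sigma}_m(X)$) for an M-graded theory $\mathcal{T}$ is defined by adding the following rules to the standard rules, i.e. reflexivity, symmetry, transitivity, congruence, substitution and axiom in E (see e.g. [26] for the standard rules of equational logic).

$$\frac{s, t \in T^{\Sigma}_m(X) \quad \mathcal{T} \vdash s = t \quad w : m \to m'}{\mathcal{T} \vdash c_w(s) = c_w(t)} \qquad \frac{t \in T^{\Sigma}_m(X)}{\mathcal{T} \vdash c_{1_m}(t) = t}$$

$$\frac{t \in T^{\Sigma}_m(X) \quad w : m \to m' \quad w' : m' \to m''}{\mathcal{T} \vdash c_{w'}(c_w(t)) = c_{w' \circ w}(t)}$$

$$\frac{f \in \Sigma_{n,m} \quad t_i \in T^{\Sigma}_{m'}(X) \text{ for each } i \in \{1, \dots, n\} \quad w : m' \to m''}{\mathcal{T} \vdash f(c_w(t_1), \dots, c_w(t_n)) = c_{m \otimes w}(f(t_1, \dots, t_n))}$$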


Definition 15. Given a model A of $\mathcal{T}$, we denote $A \Vdash s = t$ if $s, t \in T^{\Sigma}_m(n)$ (for some n) and $|s|^A = |t|^A$. If C is a category satisfying the M-model condition, we denote $\mathcal{T}, C \Vdash s = t$ if $A \Vdash s = t$ for any model A of $\mathcal{T}$ in C.


It is easy to verify that the equational logic in Definition 14 is sound. 


Theorem 1 (soundness). $\mathcal{T} \vdash s = t$ implies $\mathcal{T}, C \Vdash s = t$.


3.2 Free Models 


We describe a construction of a free model $F^{\mathcal{T}} X \in \mathrm{Mod}_{\mathcal{T}}([M,\mathrm{Set}]_0)$ of a graded theory $\mathcal{T}$ generated by a set X, which induces an adjunction between $\mathrm{Mod}_{\mathcal{T}}([M,\mathrm{Set}]_0)$ and Set. This adjunction, together with the M-action of Corollary 13, gives a graded monad as described in [7].


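Definition 16 (free model $F^{\mathcal{T}} X$). Let $\mathcal{T} = (\Sigma, E)$ be an M-graded theory. We define a functor $F^{\mathcal{T}} X : M \to \mathrm{Set}$ by $F^{\mathcal{T}} X m := T^{\Sigma}_m(X)/{\sim_m}$ for each $m \in M$ and any $X \in \mathrm{Set}$, where $s \sim_m t$ is the equivalence relation defined by $\mathcal{T} \vdash s = t$, and $F^{\mathcal{T}} X w([t]_m) = [c_w(t)]_{m'}$ for any $w : m \to m'$, where $[t]_m$ is the equivalence class of $t \in T^{\Sigma}_m(X)$. For each $f \in \Sigma_{n,m'}$, let $|f|^{F^{\mathcal{T}} X} : (F^{\mathcal{T}} X)^n \to m' \circledast F^{\mathcal{T}} X$ be a mapping defined by $|f|^{F^{\mathcal{T}} X}_m([t_1]_m, \dots, [t_n]_m) = [f(t_1, \dots, t_n)]_{m' \otimes m}$ for each $m \in M$. We define a model of $\mathcal{T}$ by $F^{\mathcal{T}} X = (F^{\mathcal{T}} X, |\cdot|^{F^{\mathcal{T}} X})$.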


The model $F^{\mathcal{T}} X$, together with the mapping $\eta_X : X \to F^{\mathcal{T}} X I$ defined by $x \mapsto [x]_I$, has the following universal property as a free model generated by X.


Lemma 17. For any model A in $[M,\mathrm{Set}]_0$ and any mapping $v : X \to AI$, there exists a unique homomorphism $\bar{v} : F^{\mathcal{T}} X \to A$ satisfying $\bar{v}_I \circ \eta_X = v$.


Corollary 18. Let $U : \mathrm{Mod}_{\mathcal{T}}([M,\mathrm{Set}]_0) \to \mathrm{Set}$ be the forgetful functor defined by the evaluation at I, that is, $UA = AI$ and $U\alpha = \alpha_I$. The free model functor $F^{\mathcal{T}} : \mathrm{Set} \to \mathrm{Mod}_{\mathcal{T}}([M,\mathrm{Set}]_0)$ is a left adjoint of U.


By considering the interpretation in the free model, we obtain the following 
completeness theorem. 


Theorem 19 (completeness). $\mathcal{T}, [M,\mathrm{Set}]_0 \Vdash s = t$ implies $\mathcal{T} \vdash s = t$.


Recall that $\mathrm{Mod}_{\mathcal{T}}([M,\mathrm{Set}]_0)$ has a left action (Corollary 13). Therefore the above adjunction induces an M-graded monad as described in [7].

The relationship between $\mathrm{Mod}_{\mathcal{T}}([M,\mathrm{Set}]_0)$ and the Eilenberg-Moore construction is as follows. In [7], the Eilenberg-Moore category $C^T$ for any graded monad T on C is introduced together with a left action $\circledast : M \times C^T \to C^T$. If C = Set and T is the graded monad obtained from an M-graded theory $\mathcal{T}$, then the Eilenberg-Moore category $\mathrm{Set}^T$ is essentially the same as $\mathrm{Mod}_{\mathcal{T}}([M,\mathrm{Set}]_0)$.


Theorem 20. The comparison functor $K : \mathrm{Mod}_{\mathcal{T}}([M,\mathrm{Set}]_0) \to \mathrm{Set}^T$ (see [7] for the definition), where $\mathcal{T}$ is an M-graded theory and T is the graded monad induced from $\mathcal{T}$, is an isomorphism. Moreover, K preserves the M-action: $\circledast \circ (M \times K) = K \circ \circledast$.


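We define the category $\mathbf{GS}_M$ of graded algebraic theories as follows.


Definition 21. Let $\mathcal{T} = (\Sigma, E)$ and $\mathcal{T}' = (\Sigma', E')$. A morphism $\alpha : \mathcal{T} \to \mathcal{T}'$ between graded algebraic theories is a family of mappings $\alpha_{n,m} : \Sigma_{n,m} \to F^{\mathcal{T}'} n m$ from operations in $\Sigma$ to $\Sigma'$-terms such that the equations in E are preserved by $\alpha$, i.e. for each $s, t \in T^{\Sigma}_m(X)$, $(s, t) \in E$ implies $|s|^{(F^{\mathcal{T}'} X, \alpha)} = |t|^{(F^{\mathcal{T}'} X, \alpha)}$ where $(F^{\mathcal{T}'} X, \alpha)$ is a model of $\mathcal{T}$ induced by $\alpha$.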


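Definition 22. Given a morphism $\alpha : \mathcal{T} \to \mathcal{T}'$, let $F^{\alpha} : F^{\mathcal{T}} \to F^{\mathcal{T}'}$ be a natural transformation defined by $F^{\alpha}([t]) = |t|^{(F^{\mathcal{T}'} X, \alpha)}$ for each $t \in T^{\Sigma}_m(X)$.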


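Definition 23. We write $\mathbf{GS}_M$ for the category of graded algebraic theories and morphisms between them. The identity morphisms are defined by $1_{\mathcal{T}}(f) = [f(x_1, \dots, x_n)]$ for each $f \in \Sigma_{n,m}$. The composition of $\alpha : \mathcal{T} \to \mathcal{T}'$ and $\beta : \mathcal{T}' \to \mathcal{T}''$ is defined by $\beta \circ \alpha(f) = F^{\beta}(\alpha(f))$.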


3.3 Examples 


Example 24 (graded modules). Let $M = (\mathbb{N}, +, 0)$ where $\mathbb{N}$ is regarded as a discrete category. Given a graded ring $A = \bigoplus_{n \in \mathbb{N}} A_n$, let $\Sigma$ be a set of operations which consists of the binary addition operation + (arity: 2, grade: 0), the unary inverse operation − (arity: 1, grade: 0), the identity element (nullary operation) 0 (arity: 0, grade: 0) and the unary scalar multiplication operation $a \cdot (-)$ (arity: 1, grade: n) for each $a \in A_n$. Let E be the equational axiom for modules.

A model $(F, |\cdot|)$ of the M-graded theory $(\Sigma, E)$ in $[M,\mathrm{Set}]_0$ consists of a set $F_n$ for each $n \in \mathbb{N}$ and functions $|+|_n : (F_n)^2 \to F_n$, $|-|_n : F_n \to F_n$, $|0|_n \in F_n$ and $|a \cdot (-)|_n : F_n \to F_{m+n}$ for each $n \in \mathbb{N}$ and each $a \in A_m$, and these interpretations satisfy E. Therefore models of $(\Sigma, E)$ in $[M,\mathrm{Set}]_0$ correspond one-to-one with graded modules.
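
To make the shape of such a model concrete, here is a small Python sketch. It is only an illustration under assumptions that are not in the text: the graded ring is taken to be Z[x] graded by degree, the module is the ring acting on itself, and a homogeneous element c·x^n is represented by the pair (n, c). The function names `add`, `neg`, `zero` and `scale` are hypothetical.

```python
# Homogeneous elements of the graded ring A = Z[x] are pairs (degree, coefficient),
# standing for coefficient * x**degree. The module F is A acting on itself,
# so F_n is also represented by pairs (n, coefficient).

def add(u, v):
    """|+|_n : F_n x F_n -> F_n (grade 0, so the degree is unchanged)."""
    (n, a), (n2, b) = u, v
    if n != n2:
        raise ValueError("addition is only defined within a single degree")
    return (n, a + b)

def neg(u):
    """|-|_n : F_n -> F_n (grade 0)."""
    return (u[0], -u[1])

def zero(n):
    """|0|_n, the zero element of F_n (grade 0)."""
    return (n, 0)

def scale(a, u):
    """|a . (-)|_n : F_n -> F_{m+n} for a in A_m (grade m)."""
    (m, c), (n, d) = a, u
    return (m + n, c * d)
```

Scalar multiplication by an element of degree m shifts the degree by m, which is exactly the grade recorded for the operation $a \cdot (-)$, while the other operations stay within one degree.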


Example 25 (graded exception monad [12, Example 3.4]). We give an algebraic presentation of the graded exception monad.

Let M and $(*, \eta, \mu)$ be a preordered monoid and the graded monad defined as follows. Let $P^+(X)$ denote the set of nonempty subsets of X. Let Ex be a set of exceptions and $M = ((P^+(Ex \cup \{Ok\}), \subseteq), I, \otimes)$ be a preordered monoid where $I = \{Ok\}$ and the multiplication $\otimes$ is defined by $m \otimes m' = (m \setminus \{Ok\}) \cup m'$ if $Ok \in m$ and $m \otimes m' = m$ otherwise (note that this is not commutative). The graded exception monad $(*, \eta, \mu)$ is the M-graded monad given as follows.

$$m * X = \{Er(e) \mid e \in m \setminus \{Ok\}\} \cup \{Ok(x) \mid x \in X \wedge Ok \in m\}$$

$$\eta_X(x) = Ok(x) \qquad \mu_{m_1,m_2,X}(Er(e)) = Er(e) \qquad \mu_{m_1,m_2,X}(Ok(x)) = x$$


The M-graded theory $\mathcal{T}^{ex}$ for the graded exception monad is defined by $(\Sigma^{ex}, \emptyset)$ where $\Sigma^{ex}$ is the set that consists of an operation $raise_e$ (arity: 0, grade: {e}) for each $e \in Ex$.

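The graded monad induced by $\mathcal{T}^{ex}$ coincides with the graded exception monad. Indeed, the free model functor $F^{\mathcal{T}^{ex}}$ for $\mathcal{T}^{ex}$ is given by $F^{\mathcal{T}^{ex}} X m = m * X$. Here, the operations $raise_e$ are interpreted as follows for each $e \in Ex$.

$$|raise_e|^{F^{\mathcal{T}^{ex}} X}_m = Er(e) \in F^{\mathcal{T}^{ex}} X(\{e\} \otimes m)$$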
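
The grading monoid and the graded exception monad of this example are small enough to write down directly. The following Python sketch is an illustration only: grades are modelled as frozensets of labels, elements of m * X as tagged pairs, and the names `tensor`, `star`, `eta` and `mu` are hypothetical choices made for this sketch.

```python
OK = "Ok"
UNIT = frozenset({OK})   # the unit grade I = {Ok}

def tensor(m, n):
    # m (x) n = (m minus {Ok}) union n if Ok is in m, and m otherwise
    return (m - {OK}) | n if OK in m else m

def star(m, xs):
    """m * X: one Er(e) per exception e in m, plus Ok(x) for x in X if Ok is in m."""
    errors = {("Er", e) for e in m if e != OK}
    values = {("Ok", x) for x in xs} if OK in m else set()
    return frozenset(errors | values)

def eta(x):
    """Unit: x is sent to Ok(x)."""
    return ("Ok", x)

def mu(v):
    """Multiplication: Er(e) stays Er(e), Ok(x) is unwrapped to x."""
    tag, payload = v
    return v if tag == "Er" else payload
```

Applying `mu` elementwise to `star(m1, star(m2, X))` lands in `star(tensor(m1, m2), X)`, and swapping the two grades generally gives a different result, which reflects the remark that the multiplication is not commutative.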


Example 26 (extending an ordinary monad to an M-graded monad). We consider the problem of extending an M'-graded theory to an M-graded theory along a lax monoidal functor of type $M' \to M$, but here we restrict ourselves to the case of M' = 1 and the strict monoidal functor of type $1 \to M$.

Let $M = (M, I, \otimes)$ be an arbitrary small strict monoidal category. Let $\mathcal{T} = (\Sigma, E)$ be a (1-graded) theory and $(T, \eta^T, \mu^T)$ be the corresponding ordinary monad. Let $\mathcal{T}^M = (\Sigma^M, E^M)$ be the M-graded theory obtained when we regard each operation in $\mathcal{T}$ as an operation with grade $I \in M$, that is, $\Sigma^M_{n,m} := \Sigma_n$ if $m = I$ and $\Sigma^M_{n,m} := \emptyset$ otherwise, and $E^M := E$.

The free model functor for $\mathcal{T}^M$ is $F^{\mathcal{T}^M} X = F^{\mathcal{T}}(M(I, -) \times X)$ where $F^{\mathcal{T}} : \mathrm{Set} \to \mathrm{Mod}_{\mathcal{T}}(\mathrm{Set})$ is the free model functor for $\mathcal{T}$ as a 1-graded theory, and the interpretation of an operation $f \in \Sigma_n$ in $F^{\mathcal{T}^M} X$ is defined by the interpretation in the free models of $\mathcal{T}$.

$$|f|^{F^{\mathcal{T}^M} X}_m = |f|^{F^{\mathcal{T}}(M(I,m) \times X)} : (F^{\mathcal{T}}(M(I,m) \times X))^n \to F^{\mathcal{T}}(M(I,m) \times X)$$


Intuitively, this can be understood as follows. Since all the operations are of grade I, coercions $c_w$ in a term can be moved to the innermost places where variables occur by repeatedly applying $c_w(f(t_1, \dots, t_n)) = f(c_w(t_1), \dots, c_w(t_n))$ (see Definition 14). Therefore, we can consider terms of $\mathcal{T}^M$ as terms of $\mathcal{T}$ whose variables are of the form $c_w(x)$.
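
The rewriting described here can be sketched as a small normalizer. The Python below is an illustration under assumptions that are not in the text: the grading category is a preorder, so a coercion is determined by its source and target grades; every operation has the unit grade; and terms are encoded as tuples `("var", name)`, `("co", src, dst, body)` and `("op", symbol, args)`. The function name `push` is hypothetical.

```python
def push(term, pending=None):
    """Move coercions down to the variables.

    `pending` is None or a pair (src, dst) describing the coercion that has been
    accumulated on the way down. Three equations are used: the composition of
    coercions, the removal of identity coercions, and the commutation of a
    coercion with an operation of unit grade.
    """
    kind = term[0]
    if kind == "var":
        if pending is None or pending[0] == pending[1]:
            return term                       # identity coercion: drop it
        return ("co", pending[0], pending[1], term)
    if kind == "co":
        _, src, dst, body = term
        # Compose the pending coercion (dst -> final) with this one (src -> dst).
        final = dst if pending is None else pending[1]
        return push(body, (src, final))
    _, symbol, args = term
    # Operation of unit grade: the coercion distributes over the arguments.
    return ("op", symbol, tuple(push(a, pending) for a in args))
```

After normalization, coercions occur only directly around variables, which matches the description of terms of $\mathcal{T}^M$ as terms of $\mathcal{T}$ over variables of the form $c_w(x)$.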

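An M-graded monad $(*, \eta, \mu)$ obtained from $\mathcal{T}^M$ is as follows.

$$m * X = T(M(I,m) \times X) \qquad \eta = \eta^T(1_I, -) \qquad \mu = T(\otimes \times X) \circ \mu^T \circ T\,\mathrm{st}$$

Here, $\otimes : M(I, m_1) \times M(I, m_2) \to M(I, m_1 \otimes m_2)$ is induced by $\otimes : M \times M \to M$ and $\mathrm{st}_{X,Y} : X \times TY \to T(X \times Y)$ is the strength for T.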


4 Graded Lawvere Theories 


We present a categorical formulation of graded algebraic theories of §3 in a 
similar fashion to ordinary Lawvere theories. 

For ordinary (single-sorted) finitary algebraic theories, a Lawvere theory is defined as a small category L with finite products together with a strict finite-product preserving identity-on-objects functor $J : \aleph_0^{op} \to L$ where $\aleph_0$ is the full subcategory of Set on natural numbers. Intuitively, morphisms in the Lawvere theory L are terms of the corresponding algebraic theory, and objects of L, which are exactly the objects in $\mathrm{ob}\,\aleph_0$, are arities.

According to the above intuition, it is expected that a graded Lawvere theory 
is also defined as a category whose objects are natural numbers and morphisms 
are graded terms. However, since terms in a graded algebraic theory are stratified 
by a monoidal category M, mere sets are insufficient to express hom-objects of 
graded Lawvere theories. Instead, we take hom-objects from the functor category $[M,\mathrm{Set}]_0$ and define graded Lawvere theories using $[M,\mathrm{Set}]_0$-categories where $[M,\mathrm{Set}]_0$ is equipped with the Day convolution monoidal structure. Specifically, $\aleph_0$ (in ordinary Lawvere theories) is replaced with an $[M,\mathrm{Set}]_0^t$-category $\mathbb{N}_M$, L with an $[M,\mathrm{Set}]_0$-category, and "finite products" with "$\mathbb{N}_M$-cotensors".

So, we first provide an enriched category $\mathbb{N}_M$ that we use as arities. Since we do not assume that M is symmetric, $\mathbb{N}_M$ is defined to be an $[M,\mathrm{Set}]_0^t$-category so that the opposite category $\mathbb{N}_M^{op}$ is an $[M,\mathrm{Set}]_0$-category. Let $[M,\mathrm{Set}]^t$ be an $[M,\mathrm{Set}]_0^t$-category induced by the closed structure of $[M,\mathrm{Set}]_0^t$. That is, hom-objects of $[M,\mathrm{Set}]^t$ are given by $[M,\mathrm{Set}]^t(G, H)m = [M,\mathrm{Set}]_0(G, H(- \otimes m))$.


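Definition 27. An $[M,\mathrm{Set}]_0^t$-category $\mathbb{N}_M$ is defined by the full sub-$[M,\mathrm{Set}]_0^t$-category of $[M,\mathrm{Set}]^t$ whose set of objects is given by $\mathrm{ob}\,\mathbb{N}_M = \{n \cdot y(I) \mid n \in \mathbb{N}\} \subseteq \mathrm{ob}[M,\mathrm{Set}]^t$ where $\mathbb{N}$ is the set of natural numbers and $n \cdot y(I)$ is the n-fold coproduct of $y(I)$. We sometimes identify $\mathrm{ob}\,\mathbb{N}_M$ with $\mathbb{N}$ via the mapping $n \mapsto n \cdot y(I)$.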


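Lemma 28. The $[M,\mathrm{Set}]_0$-category $\mathbb{N}_M^{op}$ has $\mathbb{N}_M$-cotensors, which are given by $n \pitchfork n' = n \cdot n'$ for each n and n'.


Proof. A cotensor $(n \cdot y(I)) \pitchfork (n' \cdot y(I))$ is a tensor $(n \cdot y(I)) \mathbin{\tilde\otimes^t} (n' \cdot y(I))$ in $[M,\mathrm{Set}]^t$. Since $\tilde\otimes^t$ is biclosed, $\tilde\otimes^t$ preserves colimits in both arguments. Therefore, $(n \cdot y(I)) \mathbin{\tilde\otimes^t} (n' \cdot y(I)) \cong (n \cdot n') \cdot y(I)$.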


$\mathbb{N}_M$-cotensors (i.e. $n \cdot y(I) \pitchfork C$) behave like an enriched counterpart of finite powers $(-)^n$. We show that $\mathbb{N}_M$-cotensors in a general $[M,\mathrm{Set}]_0$-category A are characterized by projections satisfying a universal property. Given a unit morphism $\nu : n \to A(n \pitchfork C, C)$ of the cotensor $n \pitchfork C$, an $[M,\mathrm{Set}]_0$-natural transformation $\bar{\nu} : A(B, n \pitchfork C) \to [n, A(B, C)]$ is given by $f \mapsto (x \mapsto \nu(x) \circ f)$. The condition that $\bar{\nu}$ is an isomorphism can be rephrased as follows.


Lemma 29. An $[M,\mathrm{Set}]_0$-category A has $\mathbb{N}_M$-cotensors if and only if for any $n \in \mathbb{N}$ and $C \in \mathrm{ob}\,A$, there exist an object $n \pitchfork C \in \mathrm{ob}\,A$ and $(\pi_1, \dots, \pi_n) \in (A(n \pitchfork C, C)I)^n$ such that the following condition holds: for each m, the function $f \mapsto (\pi_1 \circ f, \dots, \pi_n \circ f)$ of type $A(B, n \pitchfork C)m \to (A(B, C)m)^n$ is bijective.

An $[M,\mathrm{Set}]_0$-functor $F : A \to B$ preserves $\mathbb{N}_M$-cotensors if and only if $(F_{n \pitchfork C, C, I}(\pi_1), \dots, F_{n \pitchfork C, C, I}(\pi_n)) \in (B(F(n \pitchfork C), FC)I)^n$ satisfies the same condition for each n and C.


Proof. The essence of the proof is that the unit morphism $\nu : n \cdot y(I) \to A(n \pitchfork C, C)$ corresponds to elements $\pi_1, \dots, \pi_n \in A(n \pitchfork C, C)I$ by $[M,\mathrm{Set}]_0(n \cdot y(I), A(n \pitchfork C, C)) \cong [M,\mathrm{Set}]_0(y(I), A(n \pitchfork C, C))^n \cong (A(n \pitchfork C, C)I)^n$. The $[M,\mathrm{Set}]_0$-natural transformation $\bar{\nu}$ is an isomorphism if and only if each component $\bar{\nu}_m : A(B, n \pitchfork C)m \to [n, A(B, C)]m$ of $\bar{\nu}$ is an isomorphism, which is moreover equivalent to the condition that $f \mapsto (\pi_1 \circ f, \dots, \pi_n \circ f) : A(B, n \pitchfork C)m \to (A(B, C)m)^n$ is an isomorphism since we have $[n, A(B, C)]m \cong (A(B, C)m)^n$. The latter part of the lemma follows from the former part.


If $(\pi_1, \dots, \pi_n) \in (A(n \pitchfork C, C)I)^n$ satisfies the condition in Lemma 29, we call the element $\pi_i \in A(n \pitchfork C, C)I$ the i-th projection of $n \pitchfork C$. Note that the choice of projections is not necessarily unique. However, when we say that A is an $[M,\mathrm{Set}]_0$-category with $\mathbb{N}_M$-cotensors, we implicitly assume that there are a chosen cotensor $n \pitchfork C$ and chosen projections $(\pi_1, \dots, \pi_n) \in (A(n \pitchfork C, C)I)^n$ for each $n \in \mathrm{ob}\,\mathbb{N}_M$ and $C \in \mathrm{ob}\,A$. We also assume that $1 \pitchfork X = X$ without loss of generality. Given an n-tuple $(f_1, \dots, f_n)$ of elements in $A(B, C)m$, we denote by $\langle f_1, \dots, f_n \rangle$ an element in $A(B, n \pitchfork C)m$ obtained by the inverse of $f \mapsto (\pi_1 \circ f, \dots, \pi_n \circ f)$ and call this a tupling. Tuplings and projections for $\mathbb{N}_M$-cotensors behave like those for finite products.

The following proposition claims that $\mathbb{N}_M^{op}$ is a free $[M,\mathrm{Set}]_0$-category with chosen $\mathbb{N}_M$-cotensors generated by one object.


Proposition 30. Let A be an $[M,\mathrm{Set}]_0$-category with $\mathbb{N}_M$-cotensors and C be an object in A. Then there exists a unique $\mathbb{N}_M$-cotensor preserving $[M,\mathrm{Set}]_0$-functor $F : \mathbb{N}_M^{op} \to A$ such that $Fn = n \pitchfork C$ and $F\pi_i = \pi_i$.


We define M-graded Lawvere theories in a similar fashion to enriched Law- 
vere theories. 


Definition 31. An M-graded Lawvere theory is a tuple (L, J) where L is an $[M,\mathrm{Set}]_0$-category with $\mathbb{N}_M$-cotensors and $J : \mathbb{N}_M^{op} \to L$ is an identity-on-objects $\mathbb{N}_M$-cotensor preserving $[M,\mathrm{Set}]_0$-functor. A morphism $F : (L, J) \to (L', J')$ between two graded Lawvere theories is an $[M,\mathrm{Set}]_0$-functor $F : L \to L'$ such that $FJ = J'$. We denote the category of graded Lawvere theories and morphisms between them by $\mathbf{GLaw}_M$.


By Proposition 30, the existence of the above $J : \mathbb{N}_M^{op} \to L$ is equivalent to requiring that $\mathrm{ob}\,L = \mathbb{N}$ and projections in L are chosen in some way. So, we sometimes leave J implicit and just write $L \in \mathbf{GLaw}_M$ for $(L, J) \in \mathbf{GLaw}_M$.


Definition 32. A model of a graded Lawvere theory L in an $[M,\mathrm{Set}]_0$-category A with $\mathbb{N}_M$-cotensors is an $\mathbb{N}_M$-cotensor preserving $[M,\mathrm{Set}]_0$-functor of type $L \to A$. A morphism $\alpha : F \to G$ between two models F, G of a graded Lawvere theory L is an $[M,\mathrm{Set}]_0$-natural transformation. Let Mod(L, A) be the category of models of the graded Lawvere theory L in the $[M,\mathrm{Set}]_0$-category A.


In §3, we use a category C satisfying the M-model condition to define a model of a graded algebraic theory. Actually, the M-model condition is sufficient to give an $[M,\mathrm{Set}]_0$-category with $\mathbb{N}_M$-cotensors.


Lemma 33. If C satisfies the M-model condition, then the $[M,\mathrm{Set}]_0$-category $C_T$ defined in Definition 7 (where T is the $M^t$-action on C) has $\mathbb{N}_M$-cotensors.




Proof. For any $X \in C_T$ and n, the cotensor $n \pitchfork X$ is given by the finite power $X^n$, and the i-th projection is given by $\eta^{\circledast} \circ \pi_i \in C_T(X^n, X)I$ where $\pi_i : X^n \to X$ is the i-th projection of the finite power $X^n$. The rest of the proof is routine.


If we apply Lemma 33 to $[M,\mathrm{Set}]_0$ equipped with the $M^t$-action in Example 9 (here denoted by T), then $([M,\mathrm{Set}]_0)_T$ coincides with [M, Set] (i.e. the $[M,\mathrm{Set}]_0$-category obtained by the closed structure of $[M,\mathrm{Set}]_0$).


5 Equivalence 


We have shown three graded notions: graded algebraic theories, graded Lawvere theories and finitary graded monads, which give rise to categories $\mathbf{GS}_M$, $\mathbf{GLaw}_M$ and $\mathbf{GMnd}_M$, respectively. This section is about the equivalence of these three notions. We give only a sketch of the proof of the equivalence, and the details are deferred to [14, Appendix A].


5.1 Graded Algebraic Theories and Graded Lawvere Theories 


We prove that the category of graded algebraic theories $\mathbf{GS}_M$ and the category of graded Lawvere theories $\mathbf{GLaw}_M$ are equivalent by showing the existence of an adjoint equivalence $\mathrm{Th} \dashv U : \mathbf{GLaw}_M \to \mathbf{GS}_M$.

Let M be a small strict monoidal category and $\mathcal{T} = (\Sigma, E)$ be an M-graded algebraic theory. We define $\mathrm{Th}\mathcal{T}$ (the object part of Th) as an M-graded Lawvere theory whose morphisms are terms of $\mathcal{T}$ modulo equational axioms.


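Definition 34. An $[M,\mathrm{Set}]_0$-category $\mathrm{Th}\mathcal{T}$ is defined by $\mathrm{ob}(\mathrm{Th}\mathcal{T}) := \mathbb{N}$ and $(\mathrm{Th}\mathcal{T})(n, n')m := (F^{\mathcal{T}} n m)^{n'}$ with composition defined by substitution.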


It is easy to show that $\mathrm{Th}\mathcal{T}$ has $\mathbb{N}_M$-cotensors (by Lemma 29). Therefore, Th is a mapping from an object in $\mathbf{GS}_M$ to an object in $\mathbf{GLaw}_M$.

We define a functor $U : \mathbf{GLaw}_M \to \mathbf{GS}_M$ by taking all the morphisms $f \in L(n, 1)m$ in $L \in \mathbf{GLaw}_M$ as operations and all the equations that hold in L as equational axioms.


Definition 35. A functor $U : \mathbf{GLaw}_M \to \mathbf{GS}_M$ is defined as follows.


— For each $L \in \mathrm{ob}\,\mathbf{GLaw}_M$, $UL = (\Sigma, E)$ where $\Sigma_{n,m} = L(n, 1)m$, $E = \{(s, t) \mid |s| = |t|\}$ and $|\cdot| : T^{\Sigma}_m(n) \to L(n, 1)m$ is an interpretation of terms defined in the same way as Definition 11.

— Given $G : L \to L'$, let $UG : UL \to UL'$ be the morphism defined by $UG(f) = [G(f)(x_1, \dots, x_n)]$ for each $f \in L(n, 1)m$.


Then, $\mathrm{Th}\mathcal{T}$ has the following universal property as a left adjoint of U.


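Lemma 36. For each $\mathcal{T}$, let $\eta_{\mathcal{T}} : \mathcal{T} \to U\mathrm{Th}\mathcal{T}$ be a family of functions $\eta_{\mathcal{T},n,m} : \Sigma_{n,m} \to F^{U\mathrm{Th}\mathcal{T}} n m$ defined by $\eta_{\mathcal{T},n,m}(f) = [[f(x_1, \dots, x_n)](x_1, \dots, x_n)]$. For any $\alpha : \mathcal{T} \to UL$, there exists a unique morphism $\bar{\alpha} : \mathrm{Th}\mathcal{T} \to L$ such that $\alpha = U\bar{\alpha} \circ \eta_{\mathcal{T}}$.


Moreover, the unit and the counit of $\mathrm{Th} \dashv U$ are isomorphisms. Therefore: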


Theorem 37. The two categories $\mathbf{GS}_M$ and $\mathbf{GLaw}_M$ are equivalent.


We can also prove the equivalence of the categories of models.


Lemma 38. If C is a category satisfying the M-model condition, then $\mathrm{Mod}_{\mathcal{T}}(C)$ is equivalent to $\mathrm{Mod}(\mathrm{Th}\mathcal{T}, C_T)$ where T is the $M^t$-action on C.


5.2 Graded Lawvere theories and Finitary Graded Monads 


We prove that the category of graded Lawvere theories $\mathbf{GLaw}_M$ and the category of finitary graded monads $\mathbf{GMnd}_M$ are equivalent. Given a graded Lawvere theory, a finitary graded monad is obtained as a coend that represents the set of terms. On the other hand, given a finitary graded monad, a graded Lawvere theory is obtained by taking the full sub-$[M,\mathrm{Set}]_0$-category on arities $\mathrm{ob}(\mathbb{N}_M^{op})$ of the opposite category of the Kleisli(-like) category in Definition 7. These constructions give rise to an equivalence of categories.

An M-graded Lawvere theory yields a finitary graded monad by letting $m * X$ be the set of terms of grade m whose variables range over X.


Definition 39. Let L be an M-graded Lawvere theory. We define $T_L = (*, \eta, \mu)$ as a (finitary) M-graded monad whose functor part is given as follows.

$$m * X = \int^{n \in \aleph_0} L(n, 1)m \times X^n$$

Note that $L(-, 1) : \aleph_0 \to [M,\mathrm{Set}]_0$ is a Set-functor here.

Given a graded monad, a graded Lawvere theory is obtained as follows.


Definition 40. Let $T = (*, \eta, \mu)$ be an M-graded monad on Set. Let $L_T$ be the full sub-$[M,\mathrm{Set}]_0$-category of $(\mathrm{Set}_T)^{op}$ with $\mathrm{ob}(L_T) = \mathbb{N}$.


Since $L_T$ has $\mathbb{N}_M$-cotensors $n \pitchfork 1 = n$ whose projections are given by $\pi_i = (x \mapsto \eta(i)) \in \mathrm{Set}(1, I * n)$, $L_T$ is a graded Lawvere theory.

Given a morphism $\alpha : T \to T'$ in $\mathbf{GMnd}_M$, we define $L_{\alpha} : L_T \to L_{T'}$ by $(L_{\alpha})_{n,n',m} = \mathrm{Set}(n', \alpha_{n,m}) : L_T(n, n')m \to L_{T'}(n, n')m$. It is easy to prove that $L_{\alpha}$ is a morphism in $\mathbf{GLaw}_M$ and $L_{(-)} : \mathbf{GMnd}_M \to \mathbf{GLaw}_M$ is a functor.


Theorem 41. The two categories $\mathbf{GLaw}_M$ and $\mathbf{GMnd}_M$ are equivalent.


Proof. $L_{(-)}$ is an essentially surjective fully faithful functor.


6 Combining Effects 


Under the correspondence to algebraic theories, combinations of computational 
effects can be understood as combinations of algebraic theories. In particular, 
sums and tensor products are well-known constructions [9]. In this section, we 
show that these constructions can be adapted to graded algebraic theories. By the equivalence $\mathbf{GMnd}_M \simeq \mathbf{GLaw}_M \simeq \mathbf{GS}_M$ in §5, constructions like sums and tensor products in one of these categories induce those in the other two categories. So, we choose $\mathbf{GS}_M$ and describe sums as colimits in $\mathbf{GS}_M$ and tensor products as a mapping $\mathbf{GS}_{M_1} \times \mathbf{GS}_{M_2} \to \mathbf{GS}_{M_1 \times M_2}$.


6.1 Sums


We prove that $\mathbf{GS}_M$ has small colimits.


Lemma 42. The category $\mathbf{GS}_M$ has small coproducts.


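Proof. Given a family $\{(\Sigma^{(i)}, E^{(i)})\}_{i \in I}$ of objects in $\mathbf{GS}_M$, the coproduct is obtained by the disjoint union of operations and equations: $\coprod_{i \in I}(\Sigma^{(i)}, E^{(i)}) = (\bigcup_{i \in I} \Sigma^{(i)}, \bigcup_{i \in I} E^{(i)})$.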


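Lemma 43. The category $\mathbf{GS}_M$ has coequalizers.


Proof. Let $\mathcal{T} = (\Sigma, E)$ and $\mathcal{T}' = (\Sigma', E')$ be graded algebraic theories and $\alpha, \beta : \mathcal{T} \to \mathcal{T}'$ be morphisms. The coequalizer $\mathcal{T}''$ of $\alpha$ and $\beta$ is given by adding the set of equations induced by $\alpha$ and $\beta$ to $\mathcal{T}'$, that is, $\mathcal{T}'' := (\Sigma', E' \cup E'')$ where $E'' = \{(s, t) \mid \exists f \in \Sigma,\ \alpha(f) = [s] \wedge \beta(f) = [t]\}$.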


Since a category has all small colimits if and only if it has all small coproducts 
and coequalizers, we obtain the following corollary. 


Corollary 44. The three equivalent categories $\mathbf{GS}_M$, $\mathbf{GMnd}_M$ and $\mathbf{GLaw}_M$ are cocomplete.


Example 45. It is known that the sum of an ordinary monad T and the exception monad (−) + Ex (where Ex is a set of exceptions) is given by T((−) + Ex) [9, Corollary 3]. We show that a similar result holds for the graded exception monad.

Let $\mathcal{T}^{ex}$ be the theory in Example 25 and M be the preordered monoid used there. We write $(*^{ex}, \eta^{ex}, \mu^{ex})$ for the graded exception monad. Let $\mathcal{T} = (\Sigma, E)$ be a (1-graded) theory and $(T, \eta^T, \mu^T)$ be the corresponding ordinary monad. Let $\mathcal{T}^M = (\Sigma^M, E^M)$ be the M-graded theory obtained from $\mathcal{T}$ as in Example 26. We consider a graded monad obtained as the sum of $\mathcal{T}^{ex}$ and $\mathcal{T}^M$.

A free model functor F for $\mathcal{T}^{ex} + \mathcal{T}^M$ is given by $FXm = T(m *^{ex} X)$. For each n-ary operation f in $\mathcal{T}$, $|f|^{FX}_m : (T(m *^{ex} X))^n \to T(m *^{ex} X)$ is induced by free models of $\mathcal{T}$, and for each $e \in Ex$, $|raise_e|^{FX}_m : 1 \to T(\{e\} *^{ex} X)$ is defined by $\eta^T_{\{e\} *^{ex} X}(Er(e)) \in T(\{e\} *^{ex} X)$. It is easy to see that FX defined above is indeed a model of $\mathcal{T}^{ex} + \mathcal{T}^M$. Therefore, we obtain a graded monad $m * X = T(m *^{ex} X)$.


6.2 Tensor Products 


The tensor product of two ordinary algebraic theories $(\Sigma, E)$ and $(\Sigma', E')$ is constructed as $(\Sigma \cup \Sigma', E \cup E' \cup E_{\otimes})$ where $E_{\otimes}$ consists of $f(\lambda i. g(\lambda j. x_{ij})) = g(\lambda j. f(\lambda i. x_{ij}))$ for each $f \in \Sigma$ and $g \in \Sigma'$. However, when we extend tensor products to graded algebraic theories, the grades of both sides are not necessarily equal. If the grade of f is m and the grade of g is m', then the grades of $f(\lambda i. g(\lambda j. x_{ij}))$ and $g(\lambda j. f(\lambda i. x_{ij}))$ are $m \otimes m'$ and $m' \otimes m$, respectively. Therefore, we have to somehow guarantee that the grade of $f \in \Sigma$ and the grade of $g \in \Sigma'$ commute. We solve this problem by taking the product of monoidal categories. That is, we define the tensor product of an $M_1$-graded algebraic theory and an $M_2$-graded algebraic theory as an $M_1 \times M_2$-graded algebraic theory.

Before defining tensor products, we consider extending an M-graded theory to an M'-graded theory along a lax monoidal functor $G = (G, \eta^G, \mu^G) : M \to M'$. Given an M-graded theory $\mathcal{T} = (\Sigma, E)$, we define the M'-graded theory $G_*\mathcal{T} = (G_*\Sigma, G_*E)$ by $(G_*\Sigma)_{n,m'} := \{f \in \Sigma_{n,m} \mid Gm = m'\}$ and $G_*E := \{G_*(s) = G_*(t) \mid (s = t) \in E\}$ where for each term t of $\mathcal{T}$ (with grade m), $G_*(t)$ is the term of $G_*\mathcal{T}$ (with grade Gm) defined inductively as follows: if x is a variable, then $G_*(x) = c_{\eta^G}(x)$; for each $w : m \to m'$ and term t, $G_*(c_w(t)) = c_{Gw}(G_*(t))$; for each $f \in \Sigma_{n,m}$ and terms $t_1, \dots, t_n$ with grade m', $G_*(f(t_1, \dots, t_n)) = c_{\mu^G_{m,m'}}(f(G_*(t_1), \dots, G_*(t_n)))$.

The tensor product of $\mathcal{T}_1 \in \mathbf{GS}_{M_1}$ and $\mathcal{T}_2 \in \mathbf{GS}_{M_2}$ is defined by first extending $\mathcal{T}_1$ and $\mathcal{T}_2$ to $M_1 \times M_2$-graded theories and then adding commutation equations.
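
The reason for passing to the product can be checked on a small example. The Python sketch below is an illustration only: it assumes that the first grading monoid is the non-commutative monoid of the graded exception monad (grades as frozensets of labels) and that the second is (N, +, 0). The names `tensor1`, `tensor2`, `tensor`, `embed_left` and `embed_right` are hypothetical; the two embeddings send a grade m to (m, I) and to (I, m), respectively.

```python
OK = "Ok"

# First grading monoid (assumed): grades of the graded exception monad.
UNIT1 = frozenset({OK})

def tensor1(m, n):
    # (m minus {Ok}) union n if Ok is in m, and m otherwise; not commutative
    return (m - {OK}) | n if OK in m else m

# Second grading monoid (assumed): natural numbers under addition.
UNIT2 = 0

def tensor2(m, n):
    return m + n

def tensor(p, q):
    """Componentwise multiplication in the product monoid."""
    return (tensor1(p[0], q[0]), tensor2(p[1], q[1]))

def embed_left(m1):
    """Grade of an operation coming from the first theory: (m1, I)."""
    return (m1, UNIT2)

def embed_right(m2):
    """Grade of an operation coming from the second theory: (I, m2)."""
    return (UNIT1, m2)
```

Even though `tensor1` is not commutative, an embedded grade from the first monoid always commutes with an embedded grade from the second, so both sides of a commutation equation between the two theories receive the same grade.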


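Definition 46 (tensor product). Let $\mathcal{T}_1 = (\Sigma, E) \in \mathbf{GS}_{M_1}$ and $\mathcal{T}_2 = (\Sigma', E') \in \mathbf{GS}_{M_2}$. The tensor product $\mathcal{T}_1 \otimes \mathcal{T}_2$ is defined by $(K_*\Sigma \cup K'_*\Sigma', K_*E \cup K'_*E' \cup E_{\mathcal{T}_1 \otimes \mathcal{T}_2}) \in \mathbf{GS}_{M_1 \times M_2}$ where $K : M_1 \to M_1 \times M_2$ and $K' : M_2 \to M_1 \times M_2$ are lax monoidal functors defined by $Km_1 := (m_1, I_2)$ and $K'm_2 := (I_1, m_2)$, and

$$E_{\mathcal{T}_1 \otimes \mathcal{T}_2} := \{f(\lambda i. g(\lambda j. x_{ij})) = g(\lambda j. f(\lambda i. x_{ij})) \mid f \in (K_*\Sigma)_{n,m},\ g \in (K'_*\Sigma')_{n',m'}\}$$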


That is, if f is an operation in $\mathcal{T}_1$ with grade $m_1 \in M_1$, then $\mathcal{T}_1 \otimes \mathcal{T}_2$ has the operation f with grade $(m_1, I_2) \in M_1 \times M_2$ and similarly for operations in $\mathcal{T}_2$. The tensor products satisfy the following fundamental property.


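Proposition 47. Let C be a category satisfying the $M_1 \times M_2$-model condition. Let $\mathcal{T}_i$ be an $M_i$-graded algebraic theory for i = 1, 2. Then we have an isomorphism $\mathrm{Mod}_{\mathcal{T}_1}(\mathrm{Mod}_{\mathcal{T}_2}(C)) \cong \mathrm{Mod}_{\mathcal{T}_1 \otimes \mathcal{T}_2}(C)$.


Proof. Let $((A, |\cdot|'), |\cdot|) \in \mathrm{Mod}_{\mathcal{T}_1}(\mathrm{Mod}_{\mathcal{T}_2}(C))$ be a model. For each operation f in $\mathcal{T}_1$, $|f| : (A, |\cdot|')^n \to m \circledast (A, |\cdot|')$ is a homomorphism. This condition is equivalent to satisfying the equations in $E_{\mathcal{T}_1 \otimes \mathcal{T}_2}$.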


Example 48. We exemplify the tensor product by showing a graded version of [9, Corollary 6], which claims that the L-fold tensor product of the side-effects theory in [21] with one location is the side-effects theory with L locations.

First, we consider the situation where there is only one memory cell whose value ranges over a finite set V. Let 2 be the preordered monoid (join-semilattice) $(\{\bot, \top\}, \le, \vee, \bot)$ where $\le$ is the preorder defined by $\bot \le \top$. Intuitively, $\bot$ represents pure computations, and $\top$ represents (possibly) stateful computations. Let $\mathcal{T}_{st}$ be a 2-graded theory of two types of operations lookup (arity: V, grade: $\top$) and $update_v$ (arity: 1, grade: $\top$) for each $v \in V$ and the four equations in [21] for the interaction of lookup and update. Note that we have to insert a coercion to arrange the grade of the equation $lookup(\lambda v \in V. update_v(x)) = c_{\bot \le \top}(x)$.

The graded monad $(*, \eta, \mu)$ induced by $\mathcal{T}_{st}$ is as follows.

$$\bot * X = X \qquad \top * X = (V \times X)^V \qquad ((\bot \le \top) * X)(x) = \lambda v. (v, x)$$

The middle equation can be explained as follows: any term with grade $\top$ can be presented by a canonical form $t_f := lookup(\lambda v. update_{f_V(v)}(f_X(v)))$ where $f = \langle f_V, f_X \rangle : V \to V \times X$ is a function, and therefore, the mapping $f \mapsto t_f$ gives a bijection between $(V \times X)^V$ and $\top * X = T^{\Sigma}_{\top}(X)/{\sim}$.

The L-fold tensor product of $\mathcal{T}_{st}$, which we denote by $\mathcal{T}_{st}^{\otimes L}$, is a $2^L$-graded theory where $2^L = (2^L, \subseteq, \cup, \emptyset)$ is the join-semilattice of subsets of L. Specifically, $\mathcal{T}_{st}^{\otimes L}$ consists of operations $lookup_l$ and $update_{l,v}$ with grade {l} for each $l \in L$ and $v \in V$ with additional three commutation equations in [21]. The induced graded monad is $L' *^{\otimes L} X = \{f : V^L \to (V^L \times X) \mid read(L', f) \wedge write(L', f)\}$ where $L' \subseteq L$, and $read(L', f)$ and $write(L', f)$ assert that f depends only on values at locations in L' and does not change values at locations outside L'. That is, $L' *^{\otimes L} X$ represents computations that touch only memory locations in L'.


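$$read(L', f) = \forall \sigma, \sigma' \in V^L,\ (\forall l \in L',\ \sigma(l) = \sigma'(l)) \implies f(\sigma) = f(\sigma')$$

$$write(L', f) = \forall \sigma, \sigma' \in V^L,\ x \in X,\ (\sigma', x) = f(\sigma) \implies \forall l \notin L',\ \sigma(l) = \sigma'(l)$$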


7 Related Work 


Algebraic theories for graded monads. Graded monads are introduced in [27], and notions of graded theory and graded Eilenberg-Moore algebra appear in [4, 17] for coalgebraic treatment of trace semantics. However, these works only deal with $\mathbb{N}$-graded monads where $\mathbb{N}$ is regarded as a discrete monoidal category, while we deal with general monoidal categories. The Kleisli construction and the Eilenberg-Moore construction for graded monads are presented in [7] by adapting the 2-categorical argument on resolutions of monads [29].

Algebraic operations for graded monads are introduced in [12] and classified into two types, which differ in how the grades of subterms are combined. One is operations that take terms with the same grade, and these are what we treated in this paper. The other is operations that take terms with different grades: the grade of $f(t_1, \dots, t_n)$ is determined by an effect function $e : M^n \to M$ associated to f. Although the latter type of operations is also important to give natural presentations of computational effects, we leave it for future work.


Enriched Lawvere theories. There are many variants of Lawvere theories [1, 
10, 11, 15, 16, 19, 24, 25, 28], and most of them share a common pattern: they 
are defined as an identity-on-objects functor from a certain category (e.g., $\aleph_0^{\mathrm{op}}$)
which represents arities, and the functor must preserve a certain class of products 
(or cotensors if enriched). Among the most relevant work to ours are enriched 
Lawvere theories [24] and discrete Lawvere theories [10]. 

For a given monoidal category $\mathcal{V}$, a Lawvere $\mathcal{V}$-theory is defined as an
identity-on-objects, finite-cotensor (i.e. $\mathcal{V}_{\mathrm{fp}}$-cotensor) preserving $\mathcal{V}^t$-functor
$J : \mathcal{V}_{\mathrm{fp}}^{\mathrm{op}} \to \mathcal{L}$ where $\mathcal{V}_{\mathrm{fp}}$ is the full subcategory of $\mathcal{V}$ spanned by finitely presentable
objects. If $\mathcal{V} = [M, \mathbf{Set}]_0^t$, Lawvere $[M, \mathbf{Set}]_0^t$-theories are analogous to our
graded Lawvere theories except that we used $N_M^{\mathrm{op}}$ instead of $([M, \mathbf{Set}]_0)_{\mathrm{fp}}$. Since
$n \cdot y(I) \in N_M$ is finitely presentable, we can say that the notion of graded Lawvere
theory is obtained from enriched Lawvere theories by restricting arities to
$N_M \subseteq ([M, \mathbf{Set}]_0)_{\mathrm{fp}}$. However, the correspondence to finitary graded monads on
Set is an interesting point of our graded Lawvere theories compared to Lawvere
$\mathcal{V}$-theories, which correspond to finitary $\mathcal{V}$-monads on $\mathcal{V}$.

Discrete Lawvere theories restrict arities of Lawvere $\mathcal{V}$-theories to $\aleph_0$, that
is, a discrete Lawvere $\mathcal{V}$-theory is defined as a (Set-enriched) finite-product
preserving functor $J : \aleph_0^{\mathrm{op}} \to \mathcal{L}_0$ where $\mathcal{L}$ is a $\mathcal{V}^t$-category. Actually, discrete
Lawvere $[M, \mathbf{Set}]_0^t$-theories are equivalent to graded Lawvere theories because
there is a finite-product preserving functor $\iota : \aleph_0^{\mathrm{op}} \to N_M^{\mathrm{op}}$ such that
composition with $\iota$ gives a bijection between graded Lawvere theories $J : N_M^{\mathrm{op}} \to \mathcal{L}$
and discrete Lawvere $[M, \mathbf{Set}]_0^t$-theories $J_0 \circ \iota : \aleph_0^{\mathrm{op}} \to \mathcal{L}_0$. However, we
considered not only symmetric monoidal categories but also nonsymmetric ones,
which cause a nontrivial problem when we define tensor products of algebraic
theories. The problem is that adding commutation equations requires some kind
of commutativity of monoidal categories. We solved this problem by considering
product monoidal categories and defining the tensor product of an $M_1$-graded
theory and an $M_2$-graded theory as an $M_1 \times M_2$-graded theory, and the use of
two different monoidal categories is new to the best of our knowledge.


8 Conclusions and Future Work 


To extend the correspondence between algebraic theories, Lawvere theories, and 
(finitary) monads, we introduced notions of graded algebraic theory and graded 
Lawvere theory and proved their correspondence with finitary graded monads. 
We also provided sums and tensor products for graded algebraic theories, which 
are natural extensions of those for ordinary algebraic theories. Since we do not 
assume monoidal categories to be symmetric, our tensor products differ slightly
from the ordinary ones in that they combine two theories graded by (or
enriched in) different monoidal categories. We hope that these results will lead
us to apply many kinds of techniques developed for monads to graded monads. 
As future work, we are interested in “change-of-effects”, that is, changing 
the monoidal category M in M-graded algebraic theory along a (lax) monoidal 
functor $F : M \to M'$. The problem already appeared in §6.2 to define tensor
products, but we want to look for more properties of this operation. We are 
also interested in integrating a more general framework for notions of algebraic 
theory [6] and obtaining a graded version of the framework. Another direction 
is exploiting models of graded algebraic theories as modalities in the study of 
coalgebraic modal logic [4,17] or weakest precondition semantics [8]. 
A Curry-style Semantics of Interaction:


From untyped to second-order lazy λμ-calculus


James Laird 


Department of Computer Science, University of Bath, UK 


Abstract. We propose a “Curry-style” semantics of programs in which 
a nominal labelled transition system of types, characterizing observable 
behaviour, is overlaid on a nominal LTS of untyped computation. This 
leads to a notion of program equivalence as typed bisimulation. 

Our semantics reflects the role of types as hiding operators, firstly via an 
axiomatic characterization of “parallel composition with hiding” which 
yields a general technique for establishing congruence results for typed 
bisimulation, and secondly via an example which captures the hiding 
of implementations in abstract data types: a typed bisimulation for the 
(Curry-style) lazy λμ-calculus with polymorphic types. This is built on
an abstract machine for CPS evaluation of λμ-terms: we first give a
basic typing system for this LTS which characterizes acyclicity of the 
environment and local control flow, and then refine this to a polymorphic 
typing system which uses equational constraints on instantiated type 
variables, inferred from observable interaction, to capture behaviour at 
polymorphic and abstract types. 


1 Introduction 


“Church-style” and “Curry-style” are used to distinguish programming lan- 
guages in which the type of a term is intrinsic to its definition from those in 
which it is an extrinsic property. The same distinction may be applied to se- 
mantics of programming languages: in many models, type-objects are essential 
to the interpretation of a term — e.g. as a morphism between objects (types) 
in a category — but interpreting terms independently of their types (as in e.g. 
realizability interpretations) may have conceptual and practical advantages, par- 
ticularly for describing Curry-style type systems. The aim of this semantic in- 
vestigation of higher-order programs is to develop a Curry-style semantics of 
interaction by overlaying a labelled transition system of types onto an LTS of
untyped computation, so that the observable behaviour of a typed state is re- 
stricted to the actions made available by its type. Our objective is to apply this 
to lazy functional programs: untyped and with Curry-style polymorphic typing 
systems, and to develop a theory of program equivalence — typed bisimulation 
— able to describe genericity and abstract datatypes in this setting. 


Game Semantics Games models for programming languages are typically (but 
not invariably) given in a Church-style: terms are interpreted as strategies on 
a specified two-player game which represents their type [2,9]. This kind of semantics
is compositional by definition, at the cost of forgetting the internal
computational behaviour of programs, and potentially excluding system level 
behaviour [6]. It uses categorical structure to describe its models and prove key 
results — in particular soundness with respect to an operational semantics. 

By contrast, in operational game semantics [15,12], programs are interpreted
as states in a labelled transition system based directly on their syntax and oper- 
ational semantics. Internal computation is retained but can be factored out by 
restricting to observable behaviour. Soundness of these models “comes for free” 
— instead, the fundamental property requiring non-trivial proof is that they are 
compositional — that is, the equivalence induced on programs is a congruence. 
Basic structure which supports and systematizes these proofs would be useful 
(techniques such as Howe’s method are not available in this intensional setting). 
We aim to show that defining operational game semantics in a Curry style gives 
the opportunity to formulate and apply such structure. This is complementary 
to characterization of the structure of operational game semantics at a categor- 
ical level [18], into which we believe our semantics can fit well. Our motivation 
and general methodology bear similarities to the programme of Berger, Honda
and Yoshida [3] (in which Curry-style types are used to characterize the π-calculus
processes corresponding to functional and polymorphic programs) and
to typing systems for process calculi such as those described in [10].


Hiding using types We will interpret (extrinsic) types as hiding operators: 
windows through which terms of a given type may interact with the world, while 
their internal behaviour is hidden from external observation — both passive and 
active. Our goal is to show that this interpretation can be used to model infor- 
mation hiding in two key areas of higher-order computation. The first, “parallel 
composition with hiding” is the fundamental operation on which game semantics 
is based. We axiomatize the notion of a typing system for an LTS with such an 
operation, in which a type is a state which characterizes precisely the possible 
interaction between a function and its argument at that type. 

The second form of information hiding for which we give a Curry-style in- 
terpretation is hiding of implementation details using polymorphic (existential) 
types as abstract data types. Our key example of a typed labelled transition
system is a new model of the second-order λμ-calculus: we shall now discuss
the background and significance of this contribution.


1.1 Program Equivalence and Polymorphism 


Our starting point is the lazy λ-calculus (the pure, untyped λ-calculus, evaluated
by weak head reduction) and its extension with first-class continuations,
the corresponding version of Parigot’s λμ-calculus [21]. As argued in [1], the
lazy λ-calculus approximates well to the behaviour of lazy functional programming
languages such as Haskell, and is thus an appropriate setting in which to
explore properties such as program equivalence, for which there is now a rich
and well-studied theory. For instance, open or normal form bisimilarity [25] is a
coinductively defined equivalence which extends β-equivalence to infinitary behaviours.
It gives a purely intensional characterization of program equivalence
(by contrast to e.g. applicative bisimilarity, which involves quantifying over all
possible arguments) and has a variety of alternative characterizations: for
instance, two terms are open bisimilar if and only if they have the same Lévy-Longo
trees [19], or their (call-by-name) translations in the π-calculus are weakly
bisimilar [25,15]. (Or, indeed, if they are normal-form bisimilar as λμ-terms.)


Normal form bisimilarity of simply-typed λ-terms is just β-equivalence. However,
extending to polymorphic types, such as those of the second-order λ-calculus
(System F), poses deeper questions. A primary motivation for introducing
polymorphic types is that they can express abstract data types which
hide implementation details [20] (cf. the module systems of Haskell and ML). A
useful notion of program equivalence should therefore reflect this. As a simple
example, the untyped λ-terms $\lambda f.f\,\lambda x.\lambda y.x$ and $\lambda f.f\,\lambda x.\lambda y.y$ are clearly not normal
form bisimilar. But at the second-order type $\exists X.X \triangleq \forall Y.(\forall X.X \to Y) \to Y$
(which they both inhabit in a Curry-style presentation), they should be behaviourally
equivalent, since any function of type $\forall X.(X \to Y)$ will never call
its argument. In other words, the existential type $\exists X.X$ “hides” the difference
between $\lambda f.f\,\lambda x.\lambda y.x$ and $\lambda f.f\,\lambda x.\lambda y.y$. This is an observational equivalence,
but of a particularly fundamental kind, since it (and other equivalences involving
abstract data types) is robust in the presence or absence of side-effects. It
can be captured by extensional methods such as applicative bisimilarity, which
was extended to a polymorphic setting in [26], but this requires quantification
over instantiating terms and types, whereas our semantics is based on unification
of instantiating types.


The problem is that comparing the evaluation trees of terms (e.g. by nor- 
mal form bisimulation) does not capture the capacity of their types to restrict 
interaction with the environment. Game semantics does reflect this interaction 
(in various manifestations), and therefore offers a potential solution. Although 
several games models for polymorphism do not capture data abstraction by existential
types (including Hughes’ semantics of System F [8], which is faithful
with respect to βη-equivalence, and Curry-style models [16]), a series of related
approaches does so. These include translation into the (polymorphically typed)
π-calculus [4], and an operational form and a traditional compositional
presentation [14,13] of game semantics.


In these semantics, values of polymorphic variable type are interpreted as 
pointers to data of undisclosed type — e.g. a location where it is stored, or 
a channel on which it may be received. Instantiation of universally quantified 
type variables replaces this pointer-passing with copycat behaviour. This gives a 
natural interpretation of polymorphism in settings such as the π-calculus, or languages
with general references, where pointers are first-class objects. However, it
is closely associated with a Church-style presentation of second-order type systems,
e.g. by the interpretation of type abstraction as an explicit creation of a
pointer; in the case of “typed normal form bisimulation” [17] the translation of
a term is explicitly determined by its type. This is significant because it is in the
presence of polymorphism that key differences between Church-style and Curry-style
emerge, for example in allowing intersection types. The pointer-passing
models also exhibit behaviours which go beyond untyped functional interaction,
making their relationship to it unclear: in the game semantics [14], instantiation
violates the fundamental innocence and visibility conditions on strategies;
the π-calculus interpretation uses free name as well as bound name passing.

Curry-style semantics give a natural interpretation of second-order Curry- 
style typing, with a simple relationship to the semantics of the untyped λμ-calculus,
by overlaying a more refined LTS of second order types on the same
underlying LTS of computations. 


2 Typed Labelled Transition Systems 


In this section we describe a notion of typed labelled transition system and an 
associated equivalence: typed bisimulation. Based on this we axiomatize a simple 
typing system for parallel composition with hiding and show that it preserves 
typed bisimulation. Examples of typed LTS (in the form of models of the lazy
λμ-calculus and lazy λμ2-calculus) follow in the rest of the paper.

We work in the setting of nominal sets [23], which allows the introduction of 
fresh names (for store locations, communication channels, types etc). Assume a 
fixed, infinite set of atoms and a group $G$ of permutations on them. A nominal
set $X$ is an action of $G$ on a set $|X|$ such that each $x \in |X|$ has a finite supporting
set of atoms, i.e. a set such that if $\pi(a) = a$ for all atoms $a$ in this set then $\pi \cdot x = x$. We write
$\mathrm{sup}(x)$ for the $\subseteq$-least of these sets (which is the intersection of all supporting
sets for $x$).
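
As a concrete illustration, the following Python sketch models one very simple nominal set: finite trees built from atoms. The encoding (atoms as strings, elements as nested tuples, finite permutations as dictionaries) and the names `act`, `support` and `swap` are assumptions made for this example only and are not taken from [23].

```python
def act(perm, x):
    """Apply a finite permutation (dict atom -> atom) pointwise to a nested tuple of atoms."""
    if isinstance(x, str):
        return perm.get(x, x)      # atoms not mentioned by perm are fixed
    return tuple(act(perm, y) for y in x)

def support(x):
    """Least supporting set of a tree without binders: the atoms occurring in it."""
    if isinstance(x, str):
        return {x}
    return set().union(*(support(y) for y in x))

def swap(a, b):
    """The transposition exchanging atoms a and b."""
    return {a: b, b: a}

tree = ("x", ("y", "z"))
print(support(tree) == {"x", "y", "z"})          # True
print(act(swap("a", "b"), tree) == tree)         # True: a, b lie outside the support
print(act(swap("x", "a"), tree))                 # ('a', ('y', 'z'))
```

Any permutation that fixes every atom in `support(tree)` leaves the tree unchanged, which is the defining property of a supporting set.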


Definition 1. A nominal LTS is a labelled transition system $(S, \mathit{Act}, \to)$ such
that $S$ (states) and $\mathit{Act}$ (actions) are nominal sets and the transition relation $\to$
is equivariant, i.e. for any $\pi \in G$, $C \xrightarrow{a} C'$ if and only if $\pi \cdot C \xrightarrow{\pi \cdot a} \pi \cdot C'$.


Similarly motivated notions of nominal LTS are developed in e.g. [22]. Our key 
example — an abstract machine for direct-style CPS evaluation — is given in 
the next section. 

The directly observable part of a labelled transition system may be charac- 
terized by defining a typing system for it. (Similar notions of typing system for 
a process calculus are defined in [10], for example.) 


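Definition 2. A typing system for a nominal LTS $(S, \mathit{Act}, \to)$ is a nominal
LTS $(T, \mathit{Obs}, \to)$ such that $\mathit{Obs} \subseteq \mathit{Act}$, with a relation $:$ (typing) from $S$ to $T$
which satisfies the following subject reduction properties for each $C : T$:

— If $C \xrightarrow{a} C'$ and $T \xrightarrow{a} T'$ then $C' : T'$ (we write $C : T \xrightarrow{a} C' : T'$).

— If $C \xrightarrow{a} C'$, where $a \notin \mathit{Obs}$ and $\mathrm{sup}(C') \cap \mathrm{sup}(T) \subseteq \mathrm{sup}(C) \cap \mathrm{sup}(T)$, then
$C' : T$ (we write $C : T \to C' : T$).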
Subject reduction requires that actions which are observable (i.e. in $\mathit{Obs}$) change
a computation and its type in a way that respects the typing relation, and that
those which are internal to a computation (i.e. in $\mathit{Act} \setminus \mathit{Obs}$) maintain its type
(provided that any names fresh for the state are also fresh for its type).

Let $\Rightarrow$ be the reflexive, transitive closure of the internal reduction $\to$, and
define $C : T \overset{a}{\Rightarrow} C' : T'$ if $C : T \Rightarrow D : T \xrightarrow{a} D' : T' \Rightarrow C' : T'$. To define
weak bisimulation between typed states based on these relations, we need to
take account of the fact that a name may be fresh for one, but already occur
internally in the other (cf. [22]). So bisimulation is defined up to the equivalence
on the states of type $T$ which allows permutation of internal names: $C \sim_T C'$ if
there exists a permutation $\pi \in \mathrm{stab}(T)$ (i.e. $\pi \cdot T = T$) such that $C' = \pi \cdot C$.


Definition 3. A typed bisimulation is a binary, symmetric, equivariant relation
$R$ between typed states $(C : S)$, such that if $(C : S) R (D : T)$ then $S = T$ and:

1. If $C : T \xrightarrow{a} C' : T'$ then there exists $D' \sim_T D$ such that $(D' : T) \overset{a}{\Rightarrow} (D'' : T')$,
where $(C' : T') R (D'' : T')$.

2. If $C : T \to C' : T$ then there exists $D' \sim_T D$ such that $(D' : T) \Rightarrow (D'' : T)$,
where $(C' : T) R (D'' : T)$.

Typed bisimilarity is the largest typed bisimulation: states $C$ and $D$ are bisimilar
at type $T$ ($C \sim_T D$) if $(C : T)$ and $(D : T)$ are typed bisimilar.
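
The following Python sketch illustrates the definition on finite transition systems. It is a simplification under stated assumptions: names and permutations are ignored (so the condition $D' \sim_T D$ collapses to $D' = D$), transition systems are dictionaries from a state to a list of `(action, successor)` pairs, and the function names and the example systems are hypothetical.

```python
def typed_steps(comp, types, obs, state):
    """Transitions of the typed state (c, t) as (label, successor) pairs.

    An action in `obs` is kept only if the type offers it as well; any other
    action is internal (label None) and leaves the type unchanged.
    """
    c, t = state
    steps = []
    for a, c2 in comp.get(c, []):
        if a in obs:
            steps += [(a, (c2, t2)) for b, t2 in types.get(t, []) if b == a]
        else:
            steps.append((None, (c2, t)))
    return steps

def weak_succ(comp, types, obs, state, label):
    """Typed states reached by internal steps (label None), or by
    internal steps, one `label` step, then internal steps."""
    def closure(seeds):
        seen, todo = set(seeds), list(seeds)
        while todo:
            for a, s2 in typed_steps(comp, types, obs, todo.pop()):
                if a is None and s2 not in seen:
                    seen.add(s2)
                    todo.append(s2)
        return seen
    before = closure({state})
    if label is None:
        return before
    return closure({s2 for s in before
                    for a, s2 in typed_steps(comp, types, obs, s) if a == label})

def typed_bisimilar(comp, types, obs, c, d, t):
    """Greatest-fixpoint check that c and d are weakly bisimilar at type t."""
    cs = {c, d} | set(comp) | {c2 for v in comp.values() for _, c2 in v}
    ts = {t} | set(types) | {t2 for v in types.values() for _, t2 in v}
    rel = {((x, s), (y, s)) for x in cs for y in cs for s in ts}

    def simulates(p, q):
        return all(any((p2, q2) in rel
                       for q2 in weak_succ(comp, types, obs, q, a))
                   for a, p2 in typed_steps(comp, types, obs, p))

    changed = True
    while changed:                      # discard pairs until the relation is stable
        changed = False
        for p, q in list(rel):
            if (p, q) in rel and not (simulates(p, q) and simulates(q, p)):
                rel -= {(p, q), (q, p)}
                changed = True
    return ((c, t), (d, t)) in rel

# Two computations that differ only after an initial "call".
comp = {"C0": [("call", "C1")], "C1": [("left", "C2")],
        "D0": [("call", "D1")], "D1": [("right", "D2")]}
obs = {"call", "left", "right"}
types = {"T0": [("call", "T1")],                      # T1 permits nothing further
         "U0": [("call", "U1")], "U1": [("left", "U2"), ("right", "U2")]}
print(typed_bisimilar(comp, types, obs, "C0", "D0", "T0"))   # True
print(typed_bisimilar(comp, types, obs, "C0", "D0", "U0"))   # False
```

The type `T0` hides the point at which the two computations differ, so they are bisimilar at `T0` but not at the more permissive type `U0`.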


2.1 Parallel Composition with Hiding 


Having proposed an interpretation of types as operators which hide internal 
communication, we now characterize the properties of a typing system for parallel 
composition with hiding which entail that it preserves typed bisimulation (i.e. 
the latter is a congruence). 


Definition 4. An interaction structure is a nominal LTS $(S, \mathit{Act}, \to)$ such that
$\mathit{Act} = L \cup (\{+,-\} \times L)$ for some set $L$ of (unpolarized) labels, with an equivariant
partial binary operation $|$ on $S$ (parallel composition) such that if $C = C_1|C_2$
then $C \xrightarrow{a} C'$ if and only if $C' = C_1'|C_2'$ for some $C_1'$ and $C_2'$ such that either:
— $C_1 \xrightarrow{a} C_1'$ and $C_2' = C_2$, where $(\mathrm{sup}(C_1') \cup \mathrm{sup}(a)) \cap \mathrm{sup}(C_2) \subseteq \mathrm{sup}(C_1)$, or
— $C_1' = C_1$ and $C_2 \xrightarrow{a} C_2'$, where $(\mathrm{sup}(C_2') \cup \mathrm{sup}(a)) \cap \mathrm{sup}(C_1) \subseteq \mathrm{sup}(C_2)$, or
— $a = l \in L$, $C_1 \xrightarrow{pl} C_1'$ and $C_2 \xrightarrow{\bar{p}l} C_2'$, where $p \in \{+,-\}$.

The nominal side-conditions require that any names which are fresh for the
component to which they are introduced are fresh for the whole state.


Parallel composition is typed using a ternary relation between types: $T_1 \xrightarrow{T_2} T_3$
means “$T_2$ is an arrow type from $T_1$ to $T_3$” (there may be several arrow types
between two types, or none).


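Definition 5. A typing system for an interaction structure $(\mathit{Comp}, L, |)$ is a
typing system $(T, (\{+,-\} \times L), \to)$ for $\mathit{Comp}$ with an equivariant ternary relation
$\cdot \xrightarrow{\cdot} \cdot$ on $T$ such that if $T_1 \xrightarrow{T_2} T_3$ then for any $C_1 : T_1$ and $C_2 : T_2$ such that
$\mathrm{sup}(C_1) \cap \mathrm{sup}(C_2) \subseteq \mathrm{sup}(T_1)$, the state $C_1|C_2$ is well-defined, has type $T_3$ and
satisfies the following interaction conditions:

1. If $C_1 \xrightarrow{pl} C_1'$ and $C_2 \xrightarrow{\bar{p}l} C_2'$ then $T_1 \xrightarrow{pl} T_1'$ and $T_2 \xrightarrow{\bar{p}l} T_2'$ such that $T_1' \xrightarrow{T_2'} T_3$.
2. If $C_2 \xrightarrow{a} C_2'$ and $T_3 \xrightarrow{a} T_3'$ (with $\mathrm{sup}(T_3') \cap \mathrm{sup}(T_2) \subseteq \mathrm{sup}(T_3)$) then $T_2 \xrightarrow{a} T_2'$
such that $T_1 \xrightarrow{T_2'} T_3'$.
3. If $C_1 \xrightarrow{a} C_1'$ and $T_3 \xrightarrow{a'} T_3'$ then $a \neq a'$.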


Informally (1) requires that if $C_1$ and $C_2$ may communicate, then this is permitted
by $T_1$ and $T_2$, and (2) and (3) require that the observable actions of $C_1|C_2$
permitted by $T_3$ correspond to actions of $C_2$ permitted by $T_2$. Note that for any
$C_1 : T_1$ and $C_2 : T_2$ there exists $C_1' \sim_{T_1} C_1$ such that $\mathrm{sup}(C_1') \cap \mathrm{sup}(C_2) \subseteq \mathrm{sup}(T_1)$
(i.e. there are no side channels of communication between $C_1'$ and $C_2$), and
thus $C_1'|C_2$ is well-defined, has type $T_3$ and satisfies the interaction conditions.
Moreover, these are sufficient to establish that typed bisimulation is a congruence
with respect to parallel composition with hiding: a result that we will apply
to our examples in the rest of the paper.


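Proposition 1. If $C_1 \sim_{T_1} D_1$ and $C_2 \sim_{T_2} D_2$ (and $\mathrm{sup}(C_1) \cap \mathrm{sup}(C_2), \mathrm{sup}(D_1) \cap \mathrm{sup}(D_2) \subseteq \mathrm{sup}(T_1)$)
where $T_1 \xrightarrow{T_2} T_3$ then $C_1|C_2 \sim_{T_3} D_1|D_2$.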


Proof. We first establish the following renaming property: if $C_1 : T_1 \to C_1' : T_1$
then there exists $\pi \in \mathrm{stab}(T_1) \cap \mathrm{stab}(T_2) \cap \mathrm{stab}(T_3)$ such that $C_1|C_2 : T_3 \to \pi(C_1')|C_2 : T_3$,
by renaming any fresh names introduced by the internal transition so
that they are also fresh for $C_2$. Similarly, any internal reduction of $C_2$ corresponds
to a reduction of $C_1|C_2$, up to such a renaming.

So suppose $C_1|C_2 : T_3 \xrightarrow{a} C' : T_3'$ (an observable action). By definition
of an interaction structure, and conditions (2) and (3), $C_2 : T_2 \xrightarrow{a} C_2' : T_2'$ such
that $T_1 \xrightarrow{T_2'} T_3'$. By assumption, there exists $D_2' \sim_{T_2} D_2$ such that
$D_2' : T_2 \Rightarrow D_2'' : T_2 \xrightarrow{a} D_2''' : T_2' \Rightarrow D_2'''' : T_2'$ and $D_2'''' \sim_{T_2'} C_2'$, and by the renaming property
we may rename any fresh names in this reduction sequence to avoid clashes with
$D_1$, i.e. there exists $\pi \in \mathrm{stab}(T_1) \cap \mathrm{stab}(T_2) \cap \mathrm{stab}(T_3)$ such that
$D_1|\pi(D_2') : T_3 \Rightarrow D_1|\pi(D_2'') : T_3 \xrightarrow{a} D_1|\pi(D_2''') : T_3' \Rightarrow D_1|\pi(D_2'''') : T_3'$, and hence
$\pi^{-1}(D_1)|D_2' : T_3 \overset{a}{\Rightarrow} \pi^{-1}(D_1)|D_2'''' : T_3'$ as required (since bisimilarity is closed
under permutation of internal names).

If $C_1|C_2 : T_3$ performs an internal action then this is either an internal
action of $C_1 : T_1$ or $C_2 : T_2$, which is similar to the observable case, or else
$C_1 \xrightarrow{pl} C_1'$ and $C_2 \xrightarrow{\bar{p}l} C_2'$, so that $C_1|C_2$ performs the internal action $l$. Then
by interaction condition (1), $T_1 \xrightarrow{pl} T_1'$ and $T_2 \xrightarrow{\bar{p}l} T_2'$ such that $T_1' \xrightarrow{T_2'} T_3$. So since
$C_1 \sim_{T_1} D_1$ and $C_2 \sim_{T_2} D_2$, there exist $D_1' \sim_{T_1} D_1$ and $D_2' \sim_{T_2} D_2$ such that
$D_1' : T_1 \Rightarrow D_1'' : T_1 \xrightarrow{pl} D_1''' : T_1' \Rightarrow D_1'''' : T_1'$ and
$D_2' : T_2 \Rightarrow D_2'' : T_2 \xrightarrow{\bar{p}l} D_2''' : T_2' \Rightarrow D_2'''' : T_2'$, where $C_1' \sim_{T_1'} D_1''''$ and $C_2' \sim_{T_2'} D_2''''$. So using the
renaming property we may obtain $\pi \in \mathrm{stab}(T_1) \cap \mathrm{stab}(T_2) \cap \mathrm{stab}(T_3)$ such that
$D_1'|D_2' : T_3 \Rightarrow \pi(D_1'')|\pi(D_2'') : T_3 \xrightarrow{l} \pi(D_1''')|\pi(D_2''') : T_3 \Rightarrow \pi(D_1'''')|\pi(D_2'''') : T_3$
as required.


3 The Lazy λμ-calculus


We now define a typed interaction system giving an interpretation of the (untyped)
lazy λμ-calculus, i.e. a direct-style CPS interpretation of lazy functional
computation, yielding a novel, direct characterization of normal form bisimulation
as typed bisimulation. This acts as a non-trivial example of a typed
interaction system (as defined in the previous section) and a stepping stone to
the polymorphic typing system for the same underlying language in the next section.
First, we define an abstract machine for lazy CPS evaluation, in the form
of a nominal LTS in which actions make explicit the calls made by a program
to its environment. (Cf. the analysis of the λμ-calculus by π-calculus translation in
[5].)


Definition 6. The unnamed and named terms of the untyped λμ-calculus
are given (respectively) by the following grammars:

t ::= x | λx.t | t t | μα.M

M ::= [α]t


We equip the set of λμ-terms with a group action by assuming a set $N$ of
distinguished identifiers, partitioned into sorts (infinite subsets) of λ-variables
($x, y, z, \ldots$) and μ-variables ($\alpha, \beta, \gamma, \ldots$) and (for later use) type variables ($X, Y, Z, \ldots$).
The group of sort-preserving permutations on $N$ acts pointwise on expressions
(i.e. permuting elements of $N$ and fixing symbols not in $N$). We form a nominal
set of λμ-terms consisting of the terms in which the free variables are all in $N$
and those which occur bound (by λ or μ) are not, so that the support of a term
is its set of free variables.

Based on this syntax, we define the sets of expressions (control terms) which 
determine the next transition of our abstract machine. 


Definition 7. Control terms are given by the grammar: A ::= M | V | K | •

— M ranges over the set of λμ programs (named terms), i.e. M ::= [α]t.

— V ranges over the set of λμ values (λ-abstractions), i.e. V ::= λx.t.

— K ranges over the set of λμ continuations (named contexts with a single hole
at head position), i.e. K[•] ::= [α]• | K[• t].

— • is the empty context.


As above we form a nominal set of control terms in which the support of each 
element is its set of free variables. 


Definition 8. An environment is a sort-respecting finite partial function $E$ from
$N$ into the nominal sets of unnamed λμ-terms and continuations. The nominal
set of environments has the $G$-action: $(\pi \cdot E)(a) = \pi \cdot (E(\pi^{-1} \cdot a))$.
Direct-style CPS evaluation of a program in an environment proceeds as follows:


— A variable inside a continuation (E; K[x]) fetches the term bound to x and
names it with a fresh μ-variable which is bound to K.

— A β-redex inside a continuation (E; K[(λx.t)s]) binds s to a fresh λ-variable y
and K to a fresh μ-variable α and evaluates [α]t[y/x].

— A μ-abstraction inside a continuation (E; K[μα.M]) binds K to β and evaluates
M[β/α].

— A named value (E; [α]V) calls the continuation bound to α with V.


These transitions are labelled with actions of the form $a(\vec{b})$, where $a$ is the
variable called (if any) and $\vec{b}$ are the fresh variables created (if any). Except
for μ-abstraction reduction, each of these evaluation rules decomposes into a
complementary pair of input and output rules corresponding to the behaviour
of the active (or “positive”) part of the program and a passive (or “negative”)
part. This decomposition is made precise in Definition 10 (parallel composition
for configurations).
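
The evaluation steps listed above can be sketched in Python for the λ-fragment only. This is a simplified reading rather than the machine of Table 1: μ-abstractions and transition labels are omitted, only internal steps are modelled, a continuation is stored as a pair `(alpha, args)` standing for the context that applies the hole to `args` and names the result `alpha`, and the names `step`, `run`, `rename` and the fresh-name scheme `n0, n1, ...` are assumptions of the sketch.

```python
from dataclasses import dataclass
from itertools import count

@dataclass(frozen=True)
class Var:
    name: str

@dataclass(frozen=True)
class Lam:
    var: str
    body: object

@dataclass(frozen=True)
class App:
    fun: object
    arg: object

fresh = (f"n{i}" for i in count())   # fresh names; assumed not to occur in programs

def rename(t, old, new):
    """t[new/old]; `new` is fresh, so no variable capture can occur."""
    if isinstance(t, Var):
        return Var(new) if t.name == old else t
    if isinstance(t, Lam):
        return t if t.var == old else Lam(t.var, rename(t.body, old, new))
    return App(rename(t.fun, old, new), rename(t.arg, old, new))

def step(env, prog):
    """One internal step of (env; [alpha]t), or None if no internal step applies."""
    alpha, t = prog
    args = []
    while isinstance(t, App):                 # split t into its head and argument spine
        args.insert(0, t.arg)
        t = t.fun
    if isinstance(t, Lam) and args:           # beta-redex inside a continuation
        y, a = next(fresh), next(fresh)
        env = {**env, y: args[0], a: (alpha, args[1:])}
        return env, (a, rename(t.body, t.var, y))
    if isinstance(t, Var) and t.name in env:  # variable inside a continuation
        a = next(fresh)
        return {**env, a: (alpha, args)}, (a, env[t.name])
    if isinstance(t, Lam) and alpha in env:   # named value calls its continuation
        k_alpha, k_args = env[alpha]
        for s in k_args:
            t = App(t, s)
        return env, (k_alpha, t)
    return None                               # the next action would be external

def run(prog):
    """Apply internal steps from the empty environment until none applies."""
    env = {}
    while (nxt := step(env, prog)) is not None:
        env, prog = nxt
    return env, prog

# [a]((λf. f (λx.x)) (λy.y))
prog = ("a", App(Lam("f", App(Var("f"), Lam("x", Var("x")))), Lam("y", Var("y"))))
env, final = run(prog)
print(final == ("a", Lam("x", Var("x"))))     # True: the machine stops at [a]λx.x
```

Bindings are never removed from the environment in this sketch, and evaluation stops as soon as the program would have to call a name that the environment does not bind.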


Definition 9. The nominal labelled transition system $\mathit{Comp}_{\lambda\mu}$ is defined:


— States are pairs $(E; A)$, where $E$ is an environment and $A$ is a control term.
— The set of actions is $\mathcal{L} \cup (\{+,-\} \times \mathcal{L})$, where $\mathcal{L}$ is the nominal set of labels
$\bigcup \{\alpha, x(\alpha), (x, \alpha), (\alpha)\}$, the union ranging over the λ-variables $x$ and μ-variables $\alpha$ in $N$.
— The transitions are given in Table 1. By convention, a variable name mentioned
on the right of a rule but not the left is assumed not to occur there.


The polarity of a state is positive if the control term is a program or continuation,
and negative if it is a value or the empty context (we write $V_\bullet$ for a passive term
of either kind). Unpolarized transitions send positive states to positive states.
Except for μ-abstraction reduction, each corresponds to complementary, positive
and negative transitions, which send positive states to negative states and vice-versa.


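$(E[\alpha \mapsto K]; [\alpha]V_\bullet) \xrightarrow{\alpha} (E; K[V_\bullet])$
$(E; K[(\lambda x.s)\,t]) \xrightarrow{(y,\alpha)} (E, (y \mapsto t), (\alpha \mapsto K); [\alpha]s[y/x])$
$(E[x \mapsto t]; K[x]) \xrightarrow{x(\alpha)} (E, (\alpha \mapsto K); [\alpha]t)$
$(E; K[\mu\alpha.M]) \xrightarrow{(\beta)} (E, (\beta \mapsto K); M[\beta/\alpha])$

$(E; [\alpha]V_\bullet) \xrightarrow{+\alpha} (E; V_\bullet)$          $(E[\alpha \mapsto K]; V_\bullet) \xrightarrow{-\alpha} (E; K[V_\bullet])$
$(E; K[\bullet\, t]) \xrightarrow{+(y,\alpha)} (E, (y \mapsto t), (\alpha \mapsto K); \bullet)$          $(E; \lambda x.t) \xrightarrow{-(y,\alpha)} (E; [\alpha]t[y/x])$
$(E; K[x]) \xrightarrow{+x(\alpha)} (E, (\alpha \mapsto K); \bullet)$          $(E[x \mapsto t]; \bullet) \xrightarrow{-x(\alpha)} (E; [\alpha]t)$


Table 1: Abstract machine for CPS evaluation of lazy λμ-calculus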
Fig. 1: Example traces evaluating [α](λf.f λx.x) λy.y


To define an interaction structure on $\mathit{Comp}_{\lambda\mu}$ (Definition 4) we require a
parallel composition operation on configurations.


Definition 10 (Parallel Composition). On control terms, let $|$ be the (least) partial
operation such that $A|\bullet = \bullet|A = A$ and $K|V = V|K = K[V]$.

Given configurations $C_1 = (E_1; A_1)$ and $C_2 = (E_2; A_2)$ let $C_1|C_2 = (E_1 \cup E_2; A_1|A_2)$,
provided $\mathrm{dom}(E_1) \cap \mathrm{dom}(E_2) = \emptyset$ and $A_1|A_2$ is well-defined. ($C_1|C_2$ is
undefined otherwise.)


By inspection of the transitions in Table 1, we may see that $C_1|C_2$ has precisely
the transitions of $C_1$ or $C_2$ (provided any fresh names are fresh for $C_1|C_2$),
together with internal transitions arising from communication between $C_1$ and
$C_2$. Therefore we have an interaction structure according to Definition 4. Figure
1 gives an illustrative example: the evaluation of [α](λf.f λx.x) λy.y, which is
the parallel composition (λf.f λx.x)|([α]• λy.y), to [α]λx.x.


3.1 A Typing System 


We now define a basic typing system for configurations which records minimal 
information about the control term (whether it is a program, value, continuation 
or empty context) but captures a more significant property of environments — 
acyclicity. This has practical relevance for memory management, but its immediate
significance is that the second order typing in the next section relies on
the fact that an acyclic environment may be contracted into a valuation by iteratively
replacing variables bound in the environment until none occur as free
variables.


Definition 11. Given a nominal environment $E$, define the binary relation on
$N$: $a <_E b$ if $a \in \mathrm{sup}(E(b))$, and let $<_E^*$ be its transitive closure. Say that $E$ is
a pre-valuation (i.e. acyclic) if this is a strict partial order, i.e. $a \not<_E^* a$ for
all $a \in N$. $E$ is a valuation if $<_E^* = <_E$, i.e. $\mathrm{sup}(E(a)) \cap \mathrm{dom}(E) = \emptyset$ for all
$a \in \mathrm{dom}(E)$.
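
Both conditions are easy to check on a finite environment. The Python sketch below assumes that an environment is summarized by a dictionary `sup` sending each bound atom `b` to the set of atoms in the support of its binding; this representation and the function names are illustrative assumptions.

```python
def is_pre_valuation(sup):
    """True iff the relation 'a in sup[b]' has no cycle among bound atoms."""
    state = {}                          # atom -> "open" (being explored) or "done"

    def visit(b):
        if b not in sup or state.get(b) == "done":
            return True                 # unbound atoms have no predecessors
        if state.get(b) == "open":
            return False                # b is reachable from itself
        state[b] = "open"
        ok = all(visit(a) for a in sup[b])
        state[b] = "done"
        return ok

    return all(visit(b) for b in sup)

def is_valuation(sup):
    """True iff no bound atom occurs in the support of any binding."""
    return all(atoms.isdisjoint(sup) for atoms in sup.values())

chain = {"x": {"y"}, "y": {"z"}}        # x mentions y, y mentions the free atom z
loop = {"x": {"y"}, "y": {"x"}}
flat = {"x": {"z"}, "y": {"z"}}
print(is_pre_valuation(chain), is_valuation(chain))   # True False
print(is_pre_valuation(loop))                         # False
print(is_valuation(flat))                             # True
```

Every valuation is in particular a pre-valuation, since a cycle would require some bound atom to occur in a binding.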


We assume a closure operation which takes an expression $e$ and pre-valuation
$E$ to an expression $E(e)$ obtained by replacing each atom $a \in \mathrm{dom}(E)$ with
$E(a)$ in $e$, having the property that $\mathrm{sup}(E(e)) \cap \mathrm{dom}(E) = \bigcup\{\mathrm{sup}(E(a)) \mid a \in
\mathrm{sup}(e) \cap \mathrm{dom}(E)\}$.


Lemma 1. For any pre-valuation E there is a unique valuation E* such that 
E*(E(e)) = E*(e) for all expressions e. 


Proof. Defining $E^i$ by $E^{i+1}(a) = E^i(E(a))$, the $E^i$ form a chain of pre-valuations
such that the $<_E$ downward closure of $\bigcup\{\mathrm{sup}(E^i(a)) \cap \mathrm{dom}(E) \mid a \in \mathrm{dom}(E)\}$
is empty or strictly decreasing, and thus is empty for some $k$, i.e. $E^* = E^k$ is a
valuation, and thus $E^*(E(a)) = E(E^*(a)) = E^*(a)$ for all $a \in \mathrm{dom}(E)$, and so
$E^*(E(e)) = E^*(e)$ for all expressions $e$. If $E'$ is a valuation with $E'(e) = E'(E(e))$ for all expressions $e$,
then $E'(e) = E'(E^*(e)) = E^*(e)$ for all $e$.
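
The iteration used in the proof can be sketched in Python. The sketch assumes that expressions are nested tuples of atoms (strings) without binders, so that substitution is capture-free, in line with the convention that bound variables are not drawn from $N$; the function names are hypothetical.

```python
def atoms_of(e):
    """The atoms occurring in an expression."""
    return {e} if isinstance(e, str) else set().union(*map(atoms_of, e))

def close(env, e):
    """E(e): replace each atom of e bound in env by its binding, in one pass."""
    if isinstance(e, str):
        return env.get(e, e)
    return tuple(close(env, sub) for sub in e)

def star(env):
    """Iterate E^{i+1}(a) = E^i(E(a)) until no binding mentions a bound atom."""
    cur = dict(env)
    for _ in range(len(env) + 1):
        if all(atoms_of(t).isdisjoint(env) for t in cur.values()):
            return cur
        cur = {a: close(cur, env[a]) for a in env}
    raise ValueError("environment is cyclic, so it is not a pre-valuation")

env = {"x": ("f", "y"), "y": ("g", "z")}
env_star = star(env)
print(env_star["x"])                                   # ('f', ('g', 'z'))
e = ("x", "y", "w")
print(close(env_star, close(env, e)) == close(env_star, e))   # True
```

The bound `len(env) + 1` on the number of rounds reflects the fact that a chain of dependencies in an acyclic environment cannot be longer than the number of bound atoms.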


Definition 12. The basic types for control terms are tuples $\Gamma \vdash \tau; \Delta$ where
$\tau \in \{\top, \bot\}$ and $\Gamma, \Delta$ are non-repeating sequences (i.e. totally ordered finite
sets) of λ- and μ-variables in $N$, respectively.

A control term $A$ is well-typed with $\Gamma \vdash \tau; \Delta$ if $FV(A) \subseteq \Gamma \cup \Delta$ and $\tau = \top$
if and only if $A$ is a value or continuation. Basic types form a nominal set with
the evident pointwise $G$-action.


Configurations are typed with polarized versions of these types. Given a polarized
context (non-repeating sequence of polarized variables) $\Gamma = p_1x_1, \ldots, p_nx_n$
we write $|\Gamma|$ for the unpolarized context $x_1, \ldots, x_n$, $\bar{\Gamma}$ for the polarized context
$\bar{p}_1x_1, \ldots, \bar{p}_nx_n$, and $\Gamma^p$ for the (unpolarized) restriction of $\Gamma$ to $p$-polarized
elements.

Definition 13. The nominal LTS $T_{\lambda\mu}$ of basic λμ configuration types:


— States are polarized configuration types: triples $\Gamma \vdash p\tau; \Delta$, where $p\tau \in
\{+,-\} \times \{\top, \bot\}$ and $\Gamma$ and $\Delta$ are polarized contexts of λ and μ variables in
$N$.

— Actions are the polarized actions of $\mathit{Comp}_{\lambda\mu}$: $\mathit{Obs} = \{+,-\} \times \mathcal{L}$.
— Transitions are given by the rules in Table 2.


We now define a typing relation from configurations to types. Let $\Gamma$ be a polarized
context. A pre-valuation for $\Gamma$ is a pre-valuation $E$ such that $\Gamma^+ \subseteq \mathrm{dom}(E)$,
$\mathrm{sup}(E(a)) \subseteq \mathrm{dom}(E) \cup \Gamma$ for every $a \in \mathrm{dom}(E)$, and if $a, b \in \Gamma$ and $a <_E^* b$ then
$a <_\Gamma b$. Observe that if $E$ is a pre-valuation for $\Gamma$, then $E^*$ is a valuation for $\Gamma$
such that for all $a \in \Gamma^+$, $FV(E^*(a)) \subseteq \Gamma^-$.
FrpT;A da T, px - pl; A, pa
Pipa] pl; A P9 Pe pls A, pa 


Ch pT; Appa] S Pr pl;A 
tpl; Apa] S Pr pT; A 


Table 2: Transitions of basic configuration types 


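Definition 14 (λμ Typing Relation). $(E; A) : (\Gamma \vdash p\tau; \Delta)$ if $\mathrm{pol}(E; A) = p$
and $E$ is a pre-valuation for $\Gamma \cup \Delta$ such that $\Gamma^- \vdash E^*(A) : \tau; \Delta^-$, and for each
$x \in \Gamma^+$, $\Gamma^- \vdash E^*(x) : \top; \Delta^-$ and each $\alpha \in \Delta^+$, $\Gamma^- \vdash E^*(\alpha) : \top; \Delta^-$.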


It is straightforward to check that this satisfies the subject reduction properties
and thus defines a type system for $\mathrm{Comp}_{\lambda\mu}$.


Remark 1. We may apply a second constraint via our type system: local control
flow, i.e. that continuations are called according to a LIFO discipline and thus
may be stored on a stack (in game semantic terms, the well-bracketing condition).
Evaluation of $\lambda$-terms by internal (and positive) transitions naturally satisfies
this property; we can use types to ensure that the environment also does so.


Definition 15. A configuration type $\Gamma \vdash p\tau; \Delta$ satisfies the local control condition
if the polarities of $\mu$-variables in $\Delta$ are alternating, and the polarity of the
last element of $\Delta$ (if any) is $\overline{p}$.
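The local control condition can be checked mechanically on a list of polarities. The following is a minimal sketch, not taken from the paper: the function name `satisfies_local_control` is hypothetical, the context of continuation variables is abstracted to the list of their polarities in order, and the polarity required of the last element is passed in explicitly as `last` rather than derived from the configuration type.

```python
from typing import List


def satisfies_local_control(polarities: List[str], last: str) -> bool:
    """Check the two requirements of the local control condition.

    polarities: polarities ("+" or "-") of the continuation variables,
                in the order in which they occur in the context.
    last:       the polarity required of the final element (an explicit
                parameter of this sketch).
    """
    # Adjacent variables must have different polarities.
    alternating = all(a != b for a, b in zip(polarities, polarities[1:]))
    # The final element, if there is one, must have the required polarity.
    last_ok = (not polarities) or polarities[-1] == last
    return alternating and last_ok
```

An empty context satisfies the condition trivially, which matches the "(if any)" qualification in the definition.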


Transitions for local control types are given by refining the rules for calling a 
continuation to enforce stack discipline: 

Db pT; A, pa > Pr pL; A 

Db pl; A, pa > PE pT; A 


Subject reduction holds with respect to $\lambda$-configurations (in which the control
term, and all terms and continuations in the environment, contain no
$\mu$-abstractions).


3.2 A Typed Interaction Structure 


We now define an arrow relation, allowing a characterization of parallel composi- 
tion with hiding for acyclic configurations. (Acyclicity is not preserved by union 
of environments in general, so the typing rules give a useful way of identifying 
pairs of configurations for which it does hold.) 


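Definition 16. The arrow relation on configuration types $T_i = \Gamma_i \vdash p\tau_i; \Delta_i$ is
defined pointwise: $T_1 \overset{T_2}{\multimap} T_3$ if $\Gamma_1 \overset{\Gamma_2}{\multimap} \Gamma_3$, $\Delta_1 \overset{\Delta_2}{\multimap} \Delta_3$, and
$p\tau_1 \overset{p\tau_2}{\multimap} p\tau_3$, where

- For any polarized contexts, $X_1 \overset{X_2}{\multimap} X_3$ if $\overline{X_1}$ and $X_3$ have disjoint underlying
sets of elements and $X_2$ is an interleaving of $\overline{X_1}$ and $X_3$.
- $p\tau_1 \overset{p\tau_2}{\multimap} p\tau_3$ iff $p\tau_1 = -\bot$ and $p\tau_2 = p\tau_3$, or $p\tau_3 = +\bot$ and $p\tau_2 = \overline{p\tau_1}$.

It remains to show that this satisfies Definition 5.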
Proposition 2. $(\mathcal{T}_{\lambda\mu}, \multimap)$ is a well-defined typing system for $(\mathrm{Comp}_{\lambda\mu}, |)$.


Proof. Given $C_1 = (\mathcal{E}_1; A_1)$ and $C_2 = (\mathcal{E}_2; A_2)$, suppose $C_1 : T_1$, $C_2 : T_2$ and
$\mathrm{sup}(C_1) \cap \mathrm{sup}(C_2) \subseteq \mathrm{sup}(T_1) = |\Gamma_1| \cup |\Delta_1|$:


- $A_1|A_2$ is well-defined, and has type $\tau_3$, since either $A_1 : -\bot$ (i.e. $A_1 = \bullet$) and
so $A_1|A_2 : \tau_2$, or $A_1$ and $A_2$ have complementary types, and so $A_1|A_2 : +\bot$
(i.e. they are a term and context which fit together to give a program).

- $\mathcal{E}_1 \cup \mathcal{E}_2$ is a pre-valuation, since the directed graph $(\prec_{\mathcal{E}_1} \cup \prec_{\mathcal{E}_2})$ is acyclic.
(Any cycle in this graph would have to contain vertices from both $\prec_{\mathcal{E}_1}$ and
$\prec_{\mathcal{E}_2}$, since both fragments are acyclic. Any path which enters and leaves one
fragment must begin and end on points which are ordered by $\Gamma \cup \Delta$ and so
composing such paths cannot lead to a cycle.)


Moreover, it is straightforward to verify that the interaction conditions are sat- 
isfied and that we therefore have a typed interaction structure. 


Thus, by Proposition 1, typed bisimilarity is preserved by parallel composition
plus hiding.


Proposition 3. If $C_1 \sim_{T_1} D_1$, $C_2 \sim_{T_2} D_2$ and $T_1 \overset{T_2}{\multimap} T_3$ then $C_1|C_2 \sim_{T_3} D_1|D_2$.


It immediately follows that (for example) bisimilarity of values is preserved by
placing them inside the same continuation, i.e. if $(\_; v)$ and $(\_; v')$ are bisimilar
at type $\Gamma \vdash -\top; \Delta$ then $(\_; K[v])$ and $(\_; K[v'])$ are bisimilar at type $\Gamma \vdash +\bot; \Delta$.
Moreover, if typed bisimilarity is extended to an equivalence on all $\lambda\mu$-terms
($s \sim_{\Gamma;\Delta} t$ if $(\_; [\alpha]s) \sim_{-\Gamma \vdash +\bot; -\Delta, -\alpha} (\_; [\alpha]t)$, for $\alpha \notin \Delta$) we may use Proposition
3 to show that if $s \sim_{\Gamma;\Delta} t$ then for any compatible context, $C[s] \sim_{\Gamma;\Delta} C[t]$.


4 A Polymorphic Type System 


In this section we describe a more restrictive and informative typing system for
the interaction structure of $\lambda\mu$ configurations. This yields a model of the lazy
$\lambda\mu2$-calculus, i.e. lazy $\lambda\mu$-calculus with polymorphic (second-order) Curry-style
typing, which we now describe.

In order to fit such a type system to a semantics of lazy evaluation to weak
head-normal form, we combine $\lambda$-abstraction and application with abstraction
and instantiation of finite sequences of type variables, i.e. function types take
the form $\forall (X_1 \ldots X_n).\sigma \to \tau$, where $X_1 \ldots X_n$ is a finite, non-repeating sequence
of type variables. The judgments $\Theta \vdash \tau$ ($\tau$ is a well-formed type over the context
of type-variables $\Theta$) are derived according to the rules:


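$$\frac{}{\Theta, X, \Theta' \vdash X} \qquad \frac{\Theta, X_1, \ldots, X_n \vdash \sigma \qquad \Theta, X_1, \ldots, X_n \vdash \tau}{\Theta \vdash \forall (X_1 \ldots X_n).\sigma \to \tau}$$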


434 J. Laird 


Typing judgments are given with respect to an equational context (finite
sequence of equations between types). These contexts play a key role in defining
states in our LTS of types: they record constraints that type-instantiations
must satisfy. For example, if a continuation $K$ (with a hole) of type $\sigma$ is called
with an argument $v$ of type $\tau$ then the type variables in $\sigma$ and $\tau$ must have been
instantiated so as to make these types equal. Formally, we define the judgment
$\Theta \vdash \Xi$ ($\Xi$ is a well-formed equational context over $\Theta$) as follows:


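$$\frac{}{\Theta \vdash \_} \qquad \frac{\Theta \vdash \Xi \qquad \Theta \vdash \sigma \qquad \Theta \vdash \tau}{\Theta \vdash \Xi, \sigma = \tau}$$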


Type equality judgments with respect to an equational context, of the form
$\Theta; \Xi \vdash \sigma = \tau$ (where $\Theta \vdash \Xi, \sigma, \tau$) are derived according to the rules:


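$$\frac{}{\Theta; \Xi[\sigma = \tau] \vdash \sigma = \tau} \qquad \frac{}{\Theta; \Xi \vdash \tau = \tau} \qquad \frac{\Theta; \Xi \vdash \rho = \tau \qquad \Theta; \Xi \vdash \sigma = \tau}{\Theta; \Xi \vdash \rho = \sigma}$$


$$\frac{\Theta; \Xi \vdash \forall \vec{X}.\sigma \to \tau = \forall \vec{X}.\sigma' \to \tau'}{\Theta, \vec{X}; \Xi \vdash \sigma = \sigma'} \qquad \frac{\Theta; \Xi \vdash \forall \vec{X}.\sigma \to \tau = \forall \vec{X}.\sigma' \to \tau'}{\Theta, \vec{X}; \Xi \vdash \tau = \tau'} \qquad \frac{\Theta, \vec{X}; \Xi \vdash \sigma = \sigma' \qquad \Theta, \vec{X}; \Xi \vdash \tau = \tau'}{\Theta; \Xi \vdash \forall \vec{X}.\sigma \to \tau = \forall \vec{X}.\sigma' \to \tau'}$$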


A valuation $V$ for $\Theta$ satisfies an equational context $\Theta \vdash \sigma_1 = \tau_1, \ldots, \sigma_n = \tau_n$ if
$V(\sigma_i) = V(\tau_i)$ for each $i \le n$.


Lemma 2. $\Theta; \Xi \vdash \sigma = \tau$ if and only if for all valuations $V$ which satisfy $\Xi$,
$V(\sigma) = V(\tau)$.
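Satisfaction of an equational context by a valuation amounts to applying the valuation to both sides of each equation and comparing the results. The sketch below illustrates this under simplifying assumptions that are not part of the paper: types are either names (strings) or arrows encoded as tuples `("->", dom, cod)`, quantifier prefixes are omitted, and the names `apply` and `satisfies` are hypothetical.

```python
from typing import Dict, List, Tuple, Union

# A type is a name (type variable or atom) or an arrow ("->", dom, cod).
# Quantifier prefixes are left out of this simplified encoding.
Type = Union[str, Tuple[str, "Type", "Type"]]


def apply(valuation: Dict[str, Type], t: Type) -> Type:
    """Replace every type variable in t by its image under the valuation."""
    if isinstance(t, str):
        return valuation.get(t, t)
    _, dom, cod = t
    return ("->", apply(valuation, dom), apply(valuation, cod))


def satisfies(valuation: Dict[str, Type],
              equations: List[Tuple[Type, Type]]) -> bool:
    """True if V(sigma_i) == V(tau_i) for every equation in the context."""
    return all(apply(valuation, s) == apply(valuation, t)
               for s, t in equations)
```

For instance, with the single equation `("Y", ("->", "X", "X"))`, the valuation sending `X` to an atom `A` and `Y` to `("->", "A", "A")` satisfies the context, whereas sending both variables to `A` does not.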


A $\lambda\mu2$ type-in-context is a tuple $\Theta; \Xi; \Gamma \vdash \tau; \Delta$, where $\Theta$ is a context of type
variables and $\Xi$ is an equational context, $\tau$ is a $\lambda\mu2$-type (or $\bot$) and $\Gamma$ and $\Delta$
are (respectively) sequences of $\lambda$-variables and $\mu$-variables and their types (all
over $\Theta$). Assigning this type to a term may be understood as asserting that
"for any valuation $V$ of the type-variables in $\Theta$ which satisfies $\Xi$, the judgement
$V(\Gamma) \vdash t : V(\tau); V(\Delta)$ is valid". So, for example, $X, Y;\ Y = X \to X;\ \_ \vdash \lambda x.x : Y;\ \_$
is derivable according to the rules in Table 3. Note that there are no rules for
introducing or discharging equational assumptions (they will be generated by
the transitions of the LTS), so the terms of type $\Theta; \_; \Gamma \vdash t : \tau; \Delta$ are precisely
those derivable in second-order $\lambda$-calculus without type equality judgments.


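$$\frac{}{\Theta; \Xi; \Gamma[x : \tau] \vdash x : \tau; \Delta}
\qquad
\frac{\Theta; \Xi; \Gamma \vdash t : \sigma; \Delta \qquad \Theta; \Xi \vdash \sigma = \tau}{\Theta; \Xi; \Gamma \vdash t : \tau; \Delta}
\qquad
\frac{\Theta, X_1, \ldots, X_n; \Xi; \Gamma, x : \sigma \vdash t : \tau; \Delta}{\Theta; \Xi; \Gamma \vdash \lambda x.t : \forall X_1 \ldots X_n.(\sigma \to \tau); \Delta}$$


$$\frac{\Theta; \Xi; \Gamma \vdash t : \forall X_1 \ldots X_n.\sigma \to \tau; \Delta \qquad \Theta \vdash \rho_1, \ldots, \rho_n \qquad \Theta; \Xi; \Gamma \vdash s : \sigma[\rho_1/X_1 \ldots \rho_n/X_n]; \Delta}{\Theta; \Xi; \Gamma \vdash t\,s : \tau[\rho_1/X_1 \ldots \rho_n/X_n]; \Delta}$$


$$\frac{\Theta; \Xi; \Gamma \vdash t : \tau; \Delta[\alpha : \tau]}{\Theta; \Xi; \Gamma \vdash [\alpha]t : \bot; \Delta}
\qquad
\frac{\Theta; \Xi; \Gamma \vdash M : \bot; \Delta, \alpha : \tau}{\Theta; \Xi; \Gamma \vdash \mu\alpha.M : \tau; \Delta}$$


Table 3: Typing Judgments for the lazy $\lambda\mu2$-Calculus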
4.1 Second-Order Configuration Types 


We now define a second-order typing system for the interaction structure $\mathrm{Comp}_{\lambda\mu}$
of $\lambda\mu$ configurations. Its states (second-order configuration types) capture the
totality of information about the types of the control term and environment,
and the instantiations for type variables by both a program and its environment,
which may be inferred by an external observer of their interaction.


Definition 17. A second-order configuration type is a polarized $\lambda\mu2$ type-in-context:
a tuple $\Theta; \Xi; \Gamma \vdash p\tau; \Delta$, where $\Theta$ is a polarized context of type-variables,
and $\Xi$ is a polarized equational context, $\Gamma$ and $\Delta$ are polarized contexts
of typed $\lambda$ and $\mu$ variables and $p\tau$ is a polarized $\lambda\mu2$-type (or $\bot$), all over $\Theta$.


We place a further constraint — “polarized satisfiability” — on the configuration 
types which are permitted as states. This requires that their equational contexts 
can actually be satisfied by a program and environment successively instanti- 
ating type variables quantified positively and negatively (respectively), without 
knowing the types instantiated by the counterparty. 


Definition 18. A pre-valuation $V$ for a polarized context of type variables $\Theta$
positively satisfies the polarized equational context $\Theta \vdash \Xi$ (written $V \models_\Theta \Xi$)
if for any pre-valuation $W$ for $\Theta$, the first formula in $\Xi$ not satisfied by the
valuation $(V \cup W)^*$ for $|\Theta|$ (if any) is negative. $\Theta \vdash \Xi$ is (polarized) satisfiable
if $\Theta \vdash \Xi$ and $\overline{\Theta} \vdash \overline{\Xi}$ are both positively satisfiable. Note that this implies that
the underlying context $|\Theta| \vdash |\Xi|$ is satisfiable.


Determining whether a polarized context is satisfiable is equivalent to a series
of conditional (first-order) unification problems: these can be solved using the
algorithm for first-order unification [11]. We place an equivalence relation on
configuration types (cf. structural congruence of processes), allowing the principal
type to be replaced by any of the (finitely many) types to which it is equivalent
under $\Xi$.
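To make the appeal to first-order unification concrete, here is a minimal sketch of a standard unification procedure with occurs check over a simplified type syntax. It only illustrates the underlying algorithm and is not the paper's procedure: it ignores polarities, quantifiers and the conditional aspect of the problems. The encoding is an assumption of the sketch: capitalised strings are type variables, lower-case strings are atoms, and `("->", dom, cod)` is an arrow. The function names are hypothetical.

```python
from typing import Dict, List, Optional, Tuple, Union

Type = Union[str, Tuple[str, "Type", "Type"]]
Subst = Dict[str, Type]


def is_var(t: Type) -> bool:
    """Convention of this sketch: capitalised names are type variables."""
    return isinstance(t, str) and t[:1].isupper()


def walk(t: Type, subst: Subst) -> Type:
    """Follow variable bindings until an unbound variable or a non-variable."""
    while isinstance(t, str) and t in subst:
        t = subst[t]
    return t


def occurs(x: str, t: Type, subst: Subst) -> bool:
    """Occurs check: does variable x appear in t under the substitution?"""
    t = walk(t, subst)
    if isinstance(t, str):
        return t == x
    return occurs(x, t[1], subst) or occurs(x, t[2], subst)


def unify(equations: List[Tuple[Type, Type]]) -> Optional[Subst]:
    """Return a unifier in triangular form, or None if none exists."""
    subst: Subst = {}
    todo = list(equations)
    while todo:
        s, t = todo.pop()
        s, t = walk(s, subst), walk(t, subst)
        if s == t:
            continue
        if is_var(s):
            if occurs(s, t, subst):
                return None
            subst[s] = t
        elif is_var(t):
            if occurs(t, s, subst):
                return None
            subst[t] = s
        elif isinstance(s, tuple) and isinstance(t, tuple):
            todo.append((s[1], t[1]))
            todo.append((s[2], t[2]))
        else:
            return None  # clash between distinct atoms, or atom vs arrow
    return subst


def resolve(t: Type, subst: Subst) -> Type:
    """Apply a triangular substitution exhaustively to a type."""
    t = walk(t, subst)
    if isinstance(t, str):
        return t
    return ("->", resolve(t[1], subst), resolve(t[2], subst))
```

The substitution is kept in triangular form (bindings may mention other bound variables), so `resolve` is needed to read off the fully instantiated type.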


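Definition 19. $(\Theta; \Xi; \Gamma \vdash p\tau; \Delta) \simeq (\Theta; \Xi; \Gamma \vdash p\tau'; \Delta)$ if $\Theta; \Xi \vdash \tau = \tau'$.
The (bipartite, nominal) LTS $\mathcal{T}_{\lambda\mu2}$ of $\lambda\mu2$ is defined:
- States are $\simeq$-classes of satisfiable configuration types $\Theta; \Xi; \Gamma \vdash p\tau; \Delta$.


- Actions are polarized actions of $\mathrm{Comp}_{\lambda\mu}$: $\mathrm{Obs} = \{+, -\} \times \mathcal{L}$.
- Transitions are given by the rules in Table 4.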


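To define a typing relation between configurations and $\lambda\mu2$-configuration
types, we first define typing judgements $\Theta; \Xi; \Gamma \vdash A : \tau; \Delta$ for control terms. In
the case of programs and values, these are as derived according to the rules in
Table 3. For continuations, the rules


$$\frac{}{\Theta; \Xi; \Gamma \vdash [\alpha]\bullet : \tau; \Delta[\alpha : \tau]}
\qquad
\frac{\Theta; \Xi; \Gamma \vdash K : \tau[\rho_1/X_1 \ldots \rho_n/X_n]; \Delta \qquad \Theta; \Xi; \Gamma \vdash s : \sigma[\rho_1/X_1 \ldots \rho_n/X_n]; \Delta}{\Theta; \Xi; \Gamma \vdash K[\bullet\, s] : \forall X_1 \ldots X_n.\sigma \to \tau; \Delta}$$


are equivalent to typing $\Theta; \Xi; \Gamma \vdash K : \tau; \Delta$ if $\Theta; \Xi; \Gamma, \bullet : \tau \vdash K[\bullet] : \bot; \Delta$. The
empty context has type $\bot$ in any well-formed context.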
OET HWX ...Xno r A oS O,pX1,...,pXn; E Tr pro F pl; A, paT 


9; =; T [px : rt]; pl B 0:5; FPL; A pa: T 
O; Xi; I F pL; Alpa : 7] eS O:5; Tr FPT; 
0; =Z; IT F po; Alpa : 7] a 90; 2,plo = T); T F pL; A 


Table 4: Transitions of second-order configuration types 


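Definition 20 (Typing Relation). Let $V$ be a valuation for $\Theta$ which positively
satisfies $\Xi$, and define $V \models (\mathcal{E}; A) : \Theta; \Xi; \Gamma \vdash p\tau; \Delta$ if $\mathcal{E}$ is a pre-valuation
for $\Gamma, \Delta$, such that $\Theta^-; V(\Xi^-); V(\Gamma^-) \vdash \mathcal{E}^*(A) : V(\tau); V(\Delta^-)$ and for each
$x : \sigma \in \Gamma^+$, $\Theta^-; V(\Xi^-); V(\Gamma^-) \vdash \mathcal{E}^*(x) : V(\sigma); V(\Delta^-)$ and each $\alpha : \sigma \in \Delta^+$,
$\Theta^-; V(\Xi^-); V(\Gamma^-) \vdash \mathcal{E}^*(\alpha) : V(\sigma); V(\Delta^-)$.

Let $C : T$ if there exists a valuation $V$ for $\Theta$ such that $V \models C : T$.


Note that if $C : T$ and $T \simeq T'$ then $C : T'$, so typing is a well-defined relation
from configurations to equivalence classes of configuration types.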


Proposition 4. $(\mathrm{Comp}_{\lambda\mu}, :, \mathcal{T}_{\lambda\mu2})$ satisfies the subject reduction property.


Proof. For the observable transitions, this is a straightforward observation that
the typing relation is preserved. For internal transitions (specifically, $\beta$-reductions),
we use the corresponding subject reduction property for $\lambda\mu2$ substitutions,
i.e. if $\Theta; \Xi; \Gamma \vdash K[(\lambda x.t)\,s] : \bot; \Delta$ then $\Theta; \Xi; \Gamma \vdash K[t[s/x]] : \bot; \Delta$ and if
$\Theta; \Xi; \Gamma \vdash K[\mu\alpha.t] : \bot; \Delta$ then $\Theta; \Xi; \Gamma \vdash t[K/\alpha] : \bot; \Delta$.


Figure 2 gives an example illustrating the role of types in constraining behaviour:
a trace of the value $\lambda f.f\,v : \exists X.X$, where $v$ is an arbitrary typable
value (recall that $\exists X.X = \forall Y.(\forall X.X \to Y) \to Y$). Observe that there are
no transitions from the final state: a call to $y$ is not possible because
$-Y', +X' \vdash -(Y' = X')$ is not negatively satisfiable. In fact, the tree of transitions
of $\exists X.X$ branches only on negative transitions (i.e. Opponent moves). It
follows that any configuration of this type will have the same set of transitions,
and that therefore $\lambda f.f\,\lambda xy.x \sim_{\exists X.X} \lambda f.f\,\lambda xy.y$ as proposed in the introduction.


4.2 A Second-Order Typed Interaction Structure 


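It remains to prove that $\mathcal{T}_{\lambda\mu2}$ is a well-defined typing system for the interaction
structure on $\mathrm{Comp}_{\lambda\mu}$, and that typed bisimulation is therefore a congruence. We
need to establish that the pointwise extension of the arrow relation (Definition 16)
to second-order configuration types (i.e. $T_1 \overset{T_2}{\multimap} T_3$ if $\Theta_1 \overset{\Theta_2}{\multimap} \Theta_3$, $\Xi_1 \overset{\Xi_2}{\multimap} \Xi_3$,
$\Gamma_1 \overset{\Gamma_2}{\multimap} \Gamma_3$, $\Delta_1 \overset{\Delta_2}{\multimap} \Delta_3$, and $p\tau_1 \overset{p\tau_2}{\multimap} p\tau_3$) satisfies the conditions of Definition 5:
that if $C_1 = (\mathcal{E}_1; A_1) : T_1$ and $C_2 = (\mathcal{E}_2; A_2) : T_2$, where $T_1 \overset{T_2}{\multimap} T_3$ and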
(3Af-f-v) 8 (35-F -(VX.X > Y) > Y;-) 
— (9,0) 
(slalgv) 3 (-Y’; ;-g: VX.X > Y’F4+1;-a:Y’) 
+9(8) 
((B 4 [a] è v); e) (-Y’;-;-g: VX.X > Y' F -L;-a:Y',+8:VX.X > Y' 
-8 


((8 + [a] e v); [a] e v) 3 (—Y';-; -g : VX.X 4 Y' H +YX.X > Y';—a: Y’) 
+(2,9) 


((8 = [a] e v), (z = v), (y = [a]e); e) s (-Y’, +X"; 5; -g: VX.X > Y', +z: X’F-1;-a:¥',+7:Y") 
—2(5) 


y 
((B + [a] è v), (z = v), (y > [a]e); [5]v) 8 (—Y’, +X"; -; -g : VX.X > Y', +z: X' F +L;-a: Y',+y:Y',—ô: X') 
+6 


((B = [a] è v), (z+ v), (y > [a]e); v) 8 (-Y¥’, 4X55 -g : YX.X > Y', 42: X' H -X’;-a: Y', +7: Y’) 


Fig. 2: Trace of Af.f v : 3X.X 


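$\mathrm{sup}(C_1) \cap \mathrm{sup}(C_2) \subseteq \mathrm{sup}(T_1)$, then $C_1|C_2$ is well-defined, has type $T_3$ and satisfies
the interaction conditions.

By Proposition 2, $C_1|C_2 = (\mathcal{E}_1 \cup \mathcal{E}_2; A_1|A_2)$ is a well-defined configuration,
and $\mathcal{E} \triangleq \mathcal{E}_1 \cup \mathcal{E}_2$ is a pre-valuation for $\Gamma_3 \cup \Delta_3$. By the assumption that $C_1 : T_1$
and $C_2 : T_2$, there are valuations $V_1 \models C_1 : T_1$ and $V_2 \models C_2 : T_2$. Then $V \triangleq V_1 \cup V_2$
is a pre-valuation for $\Theta_3$. To show that $V^* \models C_1|C_2 : T_3$, we need to verify that: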


Lemma 3. $V$ positively satisfies $\Xi_3$.


Proof. Let $W$ be a pre-valuation for $\Theta_3$. The first formula in $\Xi$ (if any) which
is not satisfied by $V \cup W = V_1 \cup V_2 \cup W$ cannot be positive in $\Xi_1$ (positively
satisfied by $V_1$) nor in $\Xi_2$ (positively satisfied by $V_2$), and so must be a negative
formula in $\Xi_3$.


Lemma 4. 03; V*(53);V*(I3_) F E*(Ai|Az2) : V(T); V* (As ) 


Proof. Observe that €* = (Ef - Ef)’ and V = (V2 - Vj)" for some i < n. Hence, 
it suffices to prove by induction on i that O2; (V2 - V1} (22); V2 : V1) (Tz ) F 
(E3 - Ef)"(Ai|A2); V2 - V1 Y (47). 

Similarly, each term and continuation assigned to an output variable is well-typed
under closure by $V^*$ and $\mathcal{E}^*$ and thus:

Proposition 5. $C_1|C_2 : T_3$.


It remains to show that the interaction conditions of Definition 5 are satisfied.
The a is e condition 1 — that if Cı as Ci and C2 2 C3 then 


T 
Tı 4 Ti and T> & T} such that TÍ —c T3. This requires some further investiga- 
tion of configuration types. 
The interesting cases are those where $A_1 = \lambda x.t$ and $A_2 = K[\bullet\, s]$ (or vice-versa)
and so they can perform the complementary actions $-(y, \alpha)$ and $+(y, \alpha)$.
We need to show that $|\Theta_1|; |\Xi_1| \vdash \tau$ is non-atomic, that is, $|\Theta_1|; |\Xi_1| \vdash \tau =
\forall X_1 \ldots X_m.\rho \to \sigma$ for some $\rho, \sigma$. Observe that this implies that $|\Theta_2|; |\Xi_2| \vdash \tau$
is also non-atomic (since $\Xi_2$ contains the equations in $\Xi_1$) so that $T_1$ and $T_2$ can
perform the complementary actions $-(y, \alpha)$ and $+(y, \alpha)$.

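Since any derivation of a typing judgement for $\lambda x.t$ or $K[\bullet\, s]$ must conclude
with $\to$-introduction followed by applications of the type-equality rule we have:


Lemma 5. If $\Theta; \Xi; \Gamma \vdash \lambda x.t : \tau; \Delta$ or $\Theta; \Xi; \Gamma \vdash K[\bullet\, s] : \tau; \Delta$ then $\Theta; \Xi \vdash \tau$ is
non-atomic.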


Hence, by the assumption that (E1; Ax.t) 8 (O1; £2; T1 F —7; A1) and (E2; K[et])8 
(O2; Z2; I2 F +7; A) we know that 01;Vi(51) F Vi(r) and O2; V3 (£2) F V3 (T) 
are non-atomic. From the latter we may infer that O1; Vž (21) F Vš (T) is non- 
atomic, since O and £ are interleavings of O; and = with the disjoint contexts 
O3 and 53. 

So to show that $|\Theta_1|; |\Xi_1| \vdash \tau$ is non-atomic it is sufficient to prove the
contrapositive.


Lemma 6. Suppose V} Fe E and V_ Fe Z, where |O|;|&| + 7 is atomic. Then 
either O7; V4 (E£) + Vi(r) or OF; V_(E) H V_(r) is atomic. 


Proof. We extend the grammar of types with an unbounded set of "neutral
atoms" $A, B, C, \ldots$, which are equal only if syntactically identical, and prove the
lemma for this extended set of types by an outer induction on the size of $\Theta$, and
an inner induction on the sum of the lengths of the types in $\Xi$.

At least one of Vi(7) and V_(r) must be atomic and so if 5 is empty then 
the hypothesis holds. Otherwise, = = p(o = o’), =’ for some types o,o’ and 
equational context =’ over O, and polarity p € {+, —}. 

If o and o’ are both non-atomic, then by satisfiability o = VX ,...Xy.p1 > 
p2 and o = VX,...Xn.p > ph for some p1, p2, p4, ph. Letting Ai,...,An be 
fresh, distinct atomic types, define p = p[Ai/X1,...,An/Xn]. The equational 
context Z” = pA = A’), p(p2 = A), Z is equivalent to (satisfied by the 
same valuations as) =, and so O; Æ" + + is atomic, and positively and neg- 
atively satisfied by V} and V_. Hence, by inner induction hypothesis, one of 
6 3V.(8")F Vale) or O7; V-(2") Py) is atomic. 

Otherwise at least one of o and o’ is atomic. If o = o’, then we may discard 
the tautology o = o’ and apply the (inner) inductive hypothesis to O;.5" F r. 
Otherwise at least one of c, o’ must be a type-variable with polarity p in O (none 
of the other cases are p-satisfiable). So assume without loss of generality that 
O = 0',pX, 0” and = = p(o = X), =’. We may show that: 

— 0,0"; E'[o/X] + T[o/X] is atomic. 

— 0,0" + 5"a/X] is positively satisfied by V, and negatively satisfied by V_. 
So by the outer inductive hypothesis, either (O’,O”)~; V} (S[o/X]) F Vy(r) or 
(0, 0”)*; V_(2[o/X])  V_(r) is atomic, and hence either O7; V} (£) F Vy(r) 
or OF; V (Z) F V_(r) is atomic. 
We have shown that the arrow relation satisfies the first interaction condition.
Conditions 2 and 3 are straightforward to verify, establishing that $(\mathrm{Comp}_{\lambda\mu2}, :, \mathcal{T}_{\lambda\mu2})$ is
a well-defined typed interaction structure. Therefore, by Proposition 1, typed
bisimulation is preserved by parallel composition plus hiding, and thus:


Theorem 1. Typed bisimulation is a congruence for the $\lambda\mu2$-calculus.


5 Conclusions and Further Directions 


We have described a “Curry-style” approach to game semantics, and used it to 
give new models of polymorphism. Various existing models may also be framed as 
typed interaction systems, such as the semantics of call-by-value in [12]. Nor are 
instances restricted to operational game semantics: for example we can present 
linear combinatory algebras of games and strategies in this way, and poten- 
tially other models of concurrent interaction. Unlike basic Church-style game 
semantics, these models give the opportunity to make finer distinctions between 
programs based on internal behaviour, which we have not explored here. 

The notion of typed interaction structure reflects only limited structure of our 
models, but may be developed further. Having characterized parallel composition 
plus hiding within this setting, a natural next step would be a notion of copycat 
strategy, leading to structure for sharing and discarding information. One goal 
for such a development would be to put the generalization of congruence from 
configurations to terms on a systematic footing. 

In another direction, our models of polymorphism may be developed further. 
In particular combining and fully exploiting generic and abstract data types 
often requires higher-order polymorphism, in which quantifiers range over type 
operators (functions which take types as arguments and return them as values). 
Whereas this is difficult to represent in game semantics, our model readily ex-
tends to a typing system based on System $F_\omega$, which allows quantification over
type-operators: the price to pay is that satisfiability of configuration types (and
thus effective presentation of the states of our LTS) requires the solution of
higher-order unification problems, which are undecidable in general.
Abstract. Undoing computations of a concurrent system is beneficial in 
many situations, e.g., in reversible debugging of multi-threaded programs 
and in recovery from errors due to optimistic execution in parallel dis- 
crete event simulation. A number of approaches have been proposed for 
how to reverse formal models of concurrent computation including pro- 
cess calculi such as CCS, languages like Erlang, prime event structures 
and occurrence nets. However it has not been settled what properties a 
reversible system should enjoy, nor how the various properties that have 
been suggested, such as the parabolic lemma and the causal-consistency 
property, are related. We contribute to a solution to these issues by using 
a generic labelled transition system equipped with a relation capturing 
whether transitions are independent to explore the implications between 
these properties. In particular, we show how they are derivable from a 
set of axioms. Our intention is that when establishing properties of some 
formalism it will be easier to verify the axioms rather than proving prop- 
erties such as the parabolic lemma directly. We also introduce two new 
notions related to causal consistent reversibility, namely causal safety 
and causal liveness, and show that they are derivable from our axioms. 


Keywords: Reversible Computation, Labelled Transition System with 
Independence, Causal Safety, Causal Liveness 


1 Introduction 


Reversible computing studies computations which can proceed both in the stan- 
dard, forward direction, and backward, going back to past states. Reversible 
computation has attracted interest due to its applications in areas as different as 
low-power computing [15], simulation [4], robotics [21], biological modelling [31] 
and debugging [23]. 
There is widespread agreement in the literature about what properties char-
acterise reversible computation in the sequential setting. Thus in reversible fi-
nite state automata [32], reversible cellular automata [13], reversible Turing ma-
chines [2] and reversible programming languages such as Janus [35] the main
point is that the mapping from inputs to outputs is injective, and the reverse
computation is deterministic.

Matters are less clear when it comes to reversible computation in the con- 
current setting. Indeed, various reversible concurrent models have been studied, 
most notably in the areas of process calculi [6,29,18], event structures [34], Petri 
nets [1,25] and programming languages such as Erlang [20]. 

A main result of this line of research is that the notion of reversibility most 
suited for concurrent systems is causal-consistent reversibility (other notions 
are also used, e.g., to model biological systems [31]). According to an informal 
account of causal-consistent reversibility, any action can be undone provided 
that its consequences, if any, are undone beforehand. Following [6] this account
is formalised using the notion of causal equivalent traces: two traces are causal
equivalent if and only if they differ only by swapping independent actions and by
inserting or removing pairs of an action and its reverse. According to [6, Section 3]


Backtracking an event is possible when and only when a causally equiv- 
alent trace would have brought this event as the last one 


which is then formalised as the so-called causal consistency (CC) [6, Theorem 1],
stating that coinitial computations are causal equivalent if and only if they are 
cofinal. Our new proof of CC (Proposition 3.6) shows that it holds in essentially 
any reversible formalism satisfying the Loop Lemma and the Parabolic Lemma, 
and we believe that CC is insufficient on its own to capture the informal notion. 

A formalisation closer to the informal statement above is provided in [20, 
Corollary 22], stating that a forward transition t can be undone after a derivation 
iff all its consequences, if any, are undone beforehand. We are not aware of other 
discussions trying to formalise such a notion, except for [30], in the setting of 
reversible event structures. In [30], a reversible event structure is cause-respecting 
if an event cannot be reversed until all events it has caused have also been 
reversed; it is causal if it is cause-respecting and a reversible event can be reversed 
if all events it has caused have been reversed [30, Definition 3.34]. 

We provide (Section 4) a novel definition of the idea above, composed of:


Causal Safety (CS): an action cannot be reversed until any actions caused by 
it have been reversed; 

Causal Liveness (CL): we should allow actions to reverse in any order com- 
patible with CS, not necessarily the exact inverse of the forward order. 


We shall see that CC does not capture the same property as CS+CL (Exam- 
ples 4.15, 4.37), and that there are slightly different versions of CS and CL, 
which can all be proved under a small set of reasonable assumptions. 

The main aim of this paper is to take an abstract model, namely labelled 
transition systems with independence equipped with reverse transitions (Sec- 
tion 2), and to show that the properties above (as well as others) can be derived 
Acronym | Name                                    | Defined in | Proved in  | Using
--------+-----------------------------------------+------------+------------+------------------------
SP      | Square Property                         | Def. 3.1   | Axiom      | -
BTI     | Backward Transitions are Independent    | Def. 3.1   | Axiom      | -
WF      | Well-Founded                            | Def. 3.1   | Axiom      | -
CPI     | Coinitial Propagation of Independence   | Def. 4.2   | Axiom      | -
IRE     | Independence Respects Events            | Def. 4.12  | Axiom      | -
CIRE    | Coinitial Independence Respects Events  | Def. 4.29  | Axiom      | implied by IRE
IEC     | Independence of Events is Coinitial     | Def. 4.16  | Axiom      | -
PL      | Parabolic Lemma                         | Def. 3.3   | Prop. 3.4  | BTI, SP
CC      | Causal Consistency                      | Def. 3.5   | Prop. 3.6  | WF, PL
UT      | Unique Transition                       | Def. 3.7   | Cor. 3.8   | CC
ID      | Independence of Diamonds                | Def. 4.6   | Prop. 4.7  | BTI, CPI
RPI     | Reversing Preserves Independence        | Def. 4.17  | Prop. 4.18 | SP, CPI, IRE, IEC
CS      | Causal Safety                           | Def. 4.11  | Thm. 4.13  | SP, BTI, WF, CPI, IRE
CL      | Causal Liveness                         | Def. 4.11  | Thm. 4.14  | SP, BTI, WF, CPI, IRE
CS_<    | ordered Causal Safety                   | Def. 4.24  | Prop. 4.39 | SP, BTI, WF, CPI, NRE
CL_<    | ordered Causal Liveness                 | Def. 4.24  | Prop. 4.39 | SP, BTI, WF, CPI, CIRE
CS_ci   | coinitial Causal Safety                 | Def. 4.27  | Thm. 4.28  | SP, BTI, WF, CPI
CL_ci   | coinitial Causal Liveness               | Def. 4.27  | Thm. 4.30  | SP, BTI, WF, CPI, CIRE
NRE     | No Repeated Events                      | Def. 4.35  | Prop. 4.42 | SP, BTI, WF, CPI, CIRE
RED     | Reverse Event Determinism               | Def. 4.40  | Prop. 4.41 | SP, BTI, WF, CPI, NRE


Table 1. Axioms and properties for causal reversibility.


from a small set of simple axioms (Sections 3, 4, 5). This is in sharp contrast
with most work in the literature, which considers specific frameworks
such as CCS [6], CCS with broadcast [26], CCB [14], $\pi$-calculus [5], higher-order
$\pi$ [18], Klaim [11], Petri nets [25], $\mu$Oz [22] and Erlang [20], and gives similar
but formally unrelated proofs of the same main results. Such proofs will become
instances of our general results. More precisely, our axioms will:


— exclude behaviours which are not compatible with causal-consistent reversibil- 
ity (as we will discuss shortly); 

— allow us to derive the main properties of reversible calculi which have been 
studied in the literature, such as CC (Proposition 3.6); 

— hold for a number of reversible calculi which have been proposed, such as 
RCCS [6] and reversible Erlang [20] (Section 6). 


Thus, when defining a new reversible formalism, one just has to check whether
the axioms hold, and the proofs of the most relevant properties come for free.
Notably, the axioms are normally easier to prove than the properties, hence the
assessment of a reversible calculus gets much simpler.
As a reference, Table 1 lists the axioms and properties used in this paper. 
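The "using" column of Table 1 can be unfolded mechanically to see which axioms a property ultimately rests on. The sketch below is an illustration only: the dictionary transcribes a subset of the rows of Table 1 (the ordered and coinitial variants are omitted), and the names `AXIOMS`, `USES` and `axioms_needed` are hypothetical.

```python
from typing import Dict, Set

# Axioms, and the "using" column for some derived properties of Table 1.
AXIOMS: Set[str] = {"SP", "BTI", "WF", "CPI", "IRE", "CIRE", "IEC"}
USES: Dict[str, Set[str]] = {
    "PL": {"BTI", "SP"},
    "CC": {"WF", "PL"},
    "UT": {"CC"},
    "ID": {"BTI", "CPI"},
    "RPI": {"SP", "CPI", "IRE", "IEC"},
    "CS": {"SP", "BTI", "WF", "CPI", "IRE"},
    "CL": {"SP", "BTI", "WF", "CPI", "IRE"},
    "NRE": {"SP", "BTI", "WF", "CPI", "CIRE"},
    "RED": {"SP", "BTI", "WF", "CPI", "NRE"},
}


def axioms_needed(prop: str) -> Set[str]:
    """Unfold derived properties until only axioms remain."""
    if prop in AXIOMS:
        return {prop}
    needed: Set[str] = set()
    for dependency in USES[prop]:
        needed |= axioms_needed(dependency)
    return needed
```

For example, CC is proved from WF and PL, and PL from BTI and SP, so `axioms_needed("CC")` yields the three axioms WF, BTI and SP.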
In order to understand which kinds of behaviours are incompatible with a 
causal-consistent reversible setting, consider the following LTSs in CCS: 


$a.0 \xrightarrow{a} 0$, $b.0 \xrightarrow{b} 0$: from state $0$ one does not know whether to go back to $a.0$ or
to $b.0$;

$a.0 + b.0 \xrightarrow{a} 0$, $a.0 + b.0 \xrightarrow{b} 0$: as above, but starting from the same process,
hence showing that it is not enough to remember the initial configuration;

$P \xrightarrow{a} P$ where $P = a.P$: one can go back forever, against the idea that a state
models a process reachable after a finite computation.


We remark that all such behaviours are perfectly reasonable in CCS, and they 
are dealt with in the reversible setting by adding history information about past 
actions. For example, in the first case one could remember the initial state, in 
the second case both the initial state and the action taken, and in the last case 
the number of iterations that have been performed. 
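The effect of adding history information can be illustrated with a deliberately naive memory scheme, in which a state is a process paired with the list of (previous process, action) steps that led to it. This is only a sketch under that assumption and not the mechanism of any particular reversible calculus; the names `forward` and `backward` are hypothetical.

```python
from typing import Tuple

History = Tuple[Tuple[str, str], ...]   # sequence of (previous process, action)
State = Tuple[str, History]             # (current process, history)


def forward(state: State, label: str, target: str) -> State:
    """Perform a forward step and record where we came from and how."""
    proc, history = state
    return (target, history + ((proc, label),))


def backward(state: State) -> State:
    """Undo the most recent step; the history makes the predecessor unique."""
    proc, history = state
    if not history:
        raise ValueError("no past action to undo")
    previous, _label = history[-1]
    return (previous, history[:-1])
```

With this encoding the two copies of `0` reached from `a.0` and from `b.0` are different states, so going back is unambiguous, and a process such as `P = a.P` can only go back as many times as it has gone forward.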

Due to space constraints, some proofs and additional results can only be 
found in the companion technical report [16]. 


2 Labelled Transition Systems with Independence 


We want to study reversibility in a setting as general as possible. Thus, we
build on the core of the notion of labelled transition system with independence
(LTSI) [33, Definition 3.7]. However, while [33] requires a number of axioms on 
LTSI, we take the basic definition and explore what can be done by adding or 
not adding various axioms. Also, we extend LTSI with reverse transitions, since 
we study reversible systems. We define first labelled transition systems (LTSs). 

We consider the LTS of the entire set of processes in a calculus, rather than 
the transition graph of a particular process and its derivatives, hence we do not 
fix an initial state. 


Definition 2.1. A labelled transition system (LTS) is a structure $(\mathrm{Proc}, \mathrm{Lab}, \rightharpoonup)$,
where $\mathrm{Proc}$ is the set of states (or processes), $\mathrm{Lab}$ is the set of action labels and
$\rightharpoonup\ \subseteq \mathrm{Proc} \times \mathrm{Lab} \times \mathrm{Proc}$ is a transition relation.


We let $P, Q, \ldots$ range over processes, $a, b, c, \ldots$ range over labels, and $t, u, v, \ldots$
range over transitions. We can write $t : P \overset{a}{\rightharpoonup} Q$ to denote that $t = (P, a, Q)$. We
call a transition with label $a$ an $a$-transition.


Definition 2.2 (LTS with independence). We say that $(\mathrm{Proc}, \mathrm{Lab}, \rightharpoonup, \iota)$ is
an LTS with independence (LTSI) if $(\mathrm{Proc}, \mathrm{Lab}, \rightharpoonup)$ is an LTS and $\iota$ is an
irreflexive symmetric binary relation on transitions.
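As a concrete reading of the definition, a finite LTSI can be represented as four finite sets, with a check that the independence relation is irreflexive and symmetric. This is a minimal sketch; the class name `LTSI`, its field names and the method `is_well_formed` are hypothetical and chosen for illustration.

```python
from dataclasses import dataclass
from typing import FrozenSet, Tuple

Transition = Tuple[str, str, str]  # (source process, label, target process)


@dataclass(frozen=True)
class LTSI:
    procs: FrozenSet[str]
    labels: FrozenSet[str]
    trans: FrozenSet[Transition]
    indep: FrozenSet[Tuple[Transition, Transition]]

    def is_well_formed(self) -> bool:
        """Check the side conditions of an LTS with independence."""
        # Transitions only mention known processes and labels.
        typed = all(p in self.procs and a in self.labels and q in self.procs
                    for (p, a, q) in self.trans)
        # Independence relates transitions of this LTS only.
        on_trans = all(t in self.trans and u in self.trans
                       for (t, u) in self.indep)
        irreflexive = all(t != u for (t, u) in self.indep)
        symmetric = all((u, t) in self.indep for (t, u) in self.indep)
        return typed and on_trans and irreflexive and symmetric
```

A process that reacts to two events `a` and `b` in either order can be modelled with transitions `P -a-> Q` and `P -b-> R` declared independent in both directions.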


In many cases (see Section 6), the notion of independence coincides with the 
notion of concurrency. However, this is not always the case. Indeed, concur- 
rency implies that transitions are independent since they happen in different 
processses, but transitions taken by the same process can be independent as 
well. Think, for instance, of a reactive process that may react in any order to 
two events arriving at the same time, and the final result does not depend on 
the order of reactions. 

We shall assume that all transitions are reversible, so that the Loop Lemma [6, 
Lemma 6] holds. This does not hold in models of reversibility with control mech- 
anisms such as irreversible actions [6,7] or a rollback operator [17]. Nevertheless, 
when showing properties of models with controlled reversibility it has proved sen-
sible to first consider the underlying models where all transitions are reversible,
and then study how control mechanisms change the picture [11,20]. The present 
work helps with the first step. 


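Definition 2.3. Given $(\mathrm{Proc}, \mathrm{Lab}, \rightharpoonup)$, let the reverse LTS be $(\mathrm{Proc}, \mathrm{Lab}, \rightsquigarrow)$,
where $P \overset{a}{\rightsquigarrow} Q$ iff $Q \overset{a}{\rightharpoonup} P$. It is convenient to combine the two LTSs (forward and
reverse): let the reverse labels be $\underline{\mathrm{Lab}} = \{\underline{a} : a \in \mathrm{Lab}\}$, and define the combined
LTS to be $\to\ \subseteq \mathrm{Proc} \times (\mathrm{Lab} \cup \underline{\mathrm{Lab}}) \times \mathrm{Proc}$ by $P \xrightarrow{a} Q$ iff $P \overset{a}{\rightharpoonup} Q$ and $P \xrightarrow{\underline{a}} Q$ iff
$P \overset{a}{\rightsquigarrow} Q$.

We stipulate that the union $\mathrm{Lab} \cup \underline{\mathrm{Lab}}$ is disjoint. We let $\alpha, \ldots$ range over $\mathrm{Lab} \cup \underline{\mathrm{Lab}}$.
For $\alpha \in \mathrm{Lab} \cup \underline{\mathrm{Lab}}$, the underlying action label $\mathrm{und}(\alpha)$ is defined as $\mathrm{und}(a) = a$
and $\mathrm{und}(\underline{a}) = a$. Let $\underline{\underline{a}} = a$ for $a \in \mathrm{Lab}$. Given $t : P \xrightarrow{\alpha} Q$, let $\underline{t} : Q \xrightarrow{\underline{\alpha}} P$ be the
transition which reverses $t$.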

We let $\rho, \sigma, \ldots$ range over finite sequences $\alpha_1 \ldots \alpha_n$, with $\varepsilon_P$ representing the
empty sequence starting and ending at $P$. We shall write $\varepsilon$ when $P$ is understood.
Given an LTS, a path is a sequence of forward or reverse transitions of the form
$P_0 \xrightarrow{\alpha_1} P_1 \cdots \xrightarrow{\alpha_n} P_n$. We let $r, s, \ldots$ range over paths. We may write $r : P \xrightarrow{\rho} Q$
where the intermediate states are understood. On occasion we may refer to a
path simply by its sequence of labels $\rho$. Given a path $r : P \xrightarrow{\rho} Q$, the inverse
path is $\underline{r} : Q \xrightarrow{\underline{\rho}} P$ where $\underline{\varepsilon} = \varepsilon$ and $\underline{\alpha\rho} = \underline{\rho}\,\underline{\alpha}$. The length of a path $r$ (notated
$|r|$) is the number of transitions in the path. Paths $r : P \xrightarrow{\rho} Q$ and $R \xrightarrow{\sigma} S$ are
coinitial if $P = R$ and cofinal if $Q = S$. We say that a path is forward-only if it
contains no reverse transitions.

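Let $(\mathsf{Proc}, \mathsf{Lab}, \to)$ be an LTS. The irreversible processes in $(\mathsf{Proc}, \mathsf{Lab}, \to)$ are 
$\mathrm{Irr} = \{P \in \mathsf{Proc} : P \not\rightsquigarrow\}$. A rooted path is a path $r : P \xrightarrow{\rho}_* Q$ such that $P \in \mathrm{Irr}$. 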

In the following we will consider LTSIs obtained by adding a notion of inde- 
pendence to combined LTSs as above. We will call the result a combined LTSI. 


3 Basic Properties 


In this section we show that most of the properties in the reversibility literature 
(see, e.g., [6,29,18,20]), in particular the parabolic lemma and causal consistency, 
can be proved under minimal assumptions on the combined LTSI under analysis. 

We formalise the minimal assumptions using three axioms, described below. 


Definition 3.1 (Basic axioms). Let $\mathcal{L} = (\mathsf{Proc}, \mathsf{Lab}, \to, \iota)$ be a combined LTSI. 
We say $\mathcal{L}$ satisfies: 


Square Property (SP) if whenever $t : P \xrightarrow{\alpha} Q$, $u : P \xrightarrow{\beta} R$ with $t \mathrel{\iota} u$ then 
there are cofinal transitions $u' : Q \xrightarrow{\beta} S$ and $t' : R \xrightarrow{\alpha} S$; 

Backward Transitions are Independent (BTI) if whenever $t : P \stackrel{a}{\rightsquigarrow} Q$ and 
$t' : P \stackrel{b}{\rightsquigarrow} Q'$ and $t \neq t'$ then $t \mathrel{\iota} t'$; 

Well-Foundedness (WF) if there is no infinite reverse computation, i.e. we do 
not have $P_i$ (not necessarily distinct) such that $P_{i+1} \xrightarrow{a_i} P_i$ for all $i = 0, 1, \ldots$. 
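
On a finite combined LTSI the three axioms can be checked mechanically. The 
following Python sketch is one possible way to do so, under assumptions that 
are ours and not the paper's: transitions are triples (source, label, target), 
"~a" encodes the reverse label of a, and independence is given as a set of 
ordered pairs that already contains both orderings of each pair. For a finite 
LTS, WF amounts to the forward transition graph being acyclic. 

```python
def rev(t):
    """Reverse a transition (p, label, q); "~a" encodes the reverse label of a."""
    p, label, q = t
    return (q, label[1:] if label.startswith("~") else "~" + label, p)

def satisfies_sp(trans, indep):
    """SP: every independent coinitial pair t, u can be completed to a square."""
    for (p, a, q), (p2, b, r) in indep:
        if p != p2:
            continue  # SP only constrains coinitial transitions
        via_q = {s for (x, l, s) in trans if x == q and l == b}
        via_r = {s for (x, l, s) in trans if x == r and l == a}
        if not via_q & via_r:
            return False
    return True

def satisfies_bti(trans, indep):
    """BTI: distinct coinitial backward transitions are independent."""
    backward = [t for t in trans if t[1].startswith("~")]
    return all((t, u) in indep
               for t in backward for u in backward
               if t != u and t[0] == u[0])

def satisfies_wf(trans):
    """WF for a finite LTS: the forward transition graph has no cycle."""
    fwd = [(p, q) for (p, l, q) in trans if not l.startswith("~")]
    states = {s for edge in fwd for s in edge}
    while states:
        # States with no incoming forward edge from a remaining state.
        roots = {s for s in states
                 if not any(q == s and p in states for p, q in fwd)}
        if not roots:
            return False  # every remaining state lies on or after a cycle
        states -= roots
    return True

# A single commuting diamond P -a-> Q, P -b-> R, Q -b-> S, R -a-> S.
forward = {("P", "a", "Q"), ("P", "b", "R"), ("Q", "b", "S"), ("R", "a", "S")}
trans = forward | {rev(t) for t in forward}
pairs = [(("P", "a", "Q"), ("P", "b", "R")),
         (("S", "~a", "R"), ("S", "~b", "Q"))]
indep = set(pairs) | {(u, t) for t, u in pairs}
print(satisfies_sp(trans, indep), satisfies_bti(trans, indep), satisfies_wf(trans))
# True True True
```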


WF can alternatively be formulated using backward transitions, but the current 
formulation makes sense also in non-reversible calculi (e.g., CCS), which can be 
used as a comparison. Let us discuss the intuition behind these axioms. SP takes 
its name from the Square Lemma, where it is proved for concrete calculi and 
languages in [6,18,20], and captures the idea that independent transitions can 
be executed in any order, that is they form commuting diamonds. SP can be 
seen as a sanity check on the chosen notion of independence. BTI generalises 
the key notion of backward determinism used in sequential reversibility (see, 
e.g., [32] for finite state automata and [35] for the imperative language Janus) 
to a concurrent setting. Backward determinism can be spelled as “two coinitial 
backward transitions do coincide”. This can be generalised to “two coinitial 
backward transitions are independent”. Finally, WF means that we consider 
systems which have a finite past. That is, we consider systems starting from 
some initial state and then moving forward and back. 

Axioms SP and BTI are related to properties which are part of the definition 
of (occurrence) transition systems with independence in [33, Definitions 3.7, 4.1]. 
WF was used as an axiom in [28]. 

Using the minimal assumptions above we can prove relevant results from the 
literature. We first define causal equivalence, equating computations differing 
only for swaps of independent transitions and simplification of a transition with 
its reverse. 


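Definition 3.2 (cf. [6]). Let $(\mathsf{Proc}, \mathsf{Lab}, \to, \iota)$ be an LTSI satisfying SP. Let 
$\approx$ be the smallest equivalence relation on paths closed under composition and 
satisfying: 


1. if $t : P \xrightarrow{\alpha} Q$, $u : P \xrightarrow{\beta} R$ are independent, and $u' : Q \xrightarrow{\beta} S$, $t' : R \xrightarrow{\alpha} S$ 
   (which exist by SP) then $t u' \approx u t'$; 
2. $t \underline{t} \approx \varepsilon$ and $\underline{t} t \approx \varepsilon$. 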
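
Clause 2 alone already yields a simple normalisation procedure, sketched here 
in Python as an illustration. The encoding (triples with "~a" for the reverse 
label of a) and the function names are assumptions of this sketch. The 
procedure only cancels adjacent pairs consisting of a transition and its 
reverse; it does not use clause 1, so it computes a path that is causally 
equivalent to the input but not a canonical representative of its class. 

```python
def rev(t):
    """Reverse a transition (p, label, q); "~a" encodes the reverse label of a."""
    p, label, q = t
    return (q, label[1:] if label.startswith("~") else "~" + label, p)

def cancel(path):
    """Repeatedly remove adjacent pairs t, rev(t), using a stack."""
    out = []
    for t in path:
        if out and out[-1] == rev(t):
            out.pop()      # t undoes the previous transition
        else:
            out.append(t)
    return out

t = ("P", "a", "Q")
u = ("Q", "b", "R")
print(cancel([t, u, rev(u), rev(t)]))  # []
print(cancel([t, rev(t), t]))          # [('P', 'a', 'Q')]
```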


We first consider the Parabolic Lemma ([6, Lemma 10]), which states that 
each path is causal equivalent to a backward path followed by a forward path. 


Definition 3.3. Parabolic Lemma (PL): for any path $r$ there are forward-only 
paths $s, s'$ such that $r \approx \underline{s} s'$ and $|s| + |s'| \leq |r|$. 


Proposition 3.4. Suppose an LTSI satisfies BTI and SP. Then PL holds. 


The proof of Proposition 3.4 (available in [16]) is very similar to that of [6, 
Lemma 10] except that in the latter BTI is shown directly as part of the proof. 

A corollary of PL is that if a process is reachable from an irreversible process, 
then it is also forwards reachable from it. In other words, making a system 
reversible does not introduce new reachable states but only allows one to explore 
differently forwards reachable states. This is relevant in reversible debugging of 
concurrent systems [10,20], where one wants to find bugs that actually occur in 
forward-only computations. See the companion technical report [16, Corollary 
A.1]. We now move to causal consistency [6, Theorem 1]. 


Definition 3.5. Causal Consistency (CC): if $r$ and $s$ are coinitial and cofinal 
then $r \approx s$. 


Essentially, causal consistency states that history information allows one to 
distinguish computations which are not causally equivalent. Indeed, if two 
computations are cofinal, that is they reach the same final state (which includes 
the stored history information), then they need to be causally equivalent. 

Causal consistency frequently includes the other direction, namely that coini- 
tial causal equivalent computations are cofinal, meaning that there is no way to 
distinguish causal equivalent computations. This second direction follows easily 
from the definition of causal equivalence. 

Notably, our proof of CC below is very much shorter than existing proofs. 


Proposition 3.6. Suppose an LTSI satisfies WF and PL. Then CC holds. 


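Proof. Let $r : P \xrightarrow{\rho}_* Q$ and $r' : P \xrightarrow{\rho'}_* Q$. Using WF, let $I, s$ be such that 
$s : I \xrightarrow{\sigma}_* P$, $I \in \mathrm{Irr}$. Now $s r \underline{r'}\,\underline{s}$ is a path from $I$ to $I$, and so by PL there are 
$r_1, r_2$ forward-only such that $\underline{r_1} r_2 \approx s r \underline{r'}\,\underline{s}$. But $I \in \mathrm{Irr}$ and so $r_1 = \varepsilon$ and $r_2 = \varepsilon$. 
Thus $\varepsilon \approx s r \underline{r'}\,\underline{s}$, so that $s r \approx s r'$ and $r \approx r'$ as required. 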


Causal consistency implies the unique transition property. 


Definition 3.7. An LTSI $(\mathsf{Proc}, \mathsf{Lab}, \to, \iota)$ satisfies Unique Transition (UT) 
if $P \xrightarrow{a} Q$ and $P \xrightarrow{b} Q$ imply $a = b$. 


Corollary 3.8. If an LTSI satisfies CC then it satisfies UT. 


UT was shown in the forward-only setting of occurrence TSIs in [33, Corol- 
lary 4.4]; it was taken as an axiom in [28]. 


Example 3.9 (PL alone does not imply WF or CC). Consider the LTSI with 
states $P_i$ for $i = 0, 1, \ldots$ and transitions $t_i : P_{i+1} \xrightarrow{a} P_i$, $u_i : P_{i+1} \xrightarrow{b} P_i$ with 
$a \neq b$ and $\underline{t_i} \mathrel{\iota} \underline{u_i}$. BTI and SP hold. Hence PL holds by Proposition 3.4. However 
clearly WF fails. Also $t_i$ and $u_i$ are coinitial and cofinal, and $a \neq b$, so that UT 
fails, and hence CC fails using Corollary 3.8. Note that the $ab$ diamonds here 
have the same side states so are degenerate (cf. Lemma 4.4). 


4 Causal Safety and Causal Liveness 


In the literature, causal consistent reversibility is frequently informally described 
by saying that “a transition can be undone if and only if each of its consequences, 
if any, has been undone”. In this section we study this property, where the 
two implications will be referred to as causal safety and causal liveness. We 
provide three different versions of such properties, based on independence of 
transitions (Section 4.2), ordering of events (Section 4.3), and independence 
of events (Section 4.4), and study their relationships. In order to define such 
properties we need the concept of event. 


4.1 Events 


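Definition 4.1 (Event, general definition). Let $(\mathsf{Proc}, \mathsf{Lab}, \to, \iota)$ be an LTSI. 
Let $\sim$ be the smallest equivalence relation satisfying: if $t : P \xrightarrow{\alpha} Q$, $u : P \xrightarrow{\beta} R$, 
$u' : Q \xrightarrow{\beta} S$, $t' : R \xrightarrow{\alpha} S$, and $t \mathrel{\iota} u$, $\underline{u} \mathrel{\iota} t'$, $\underline{t'} \mathrel{\iota} \underline{u'}$, $u' \mathrel{\iota} \underline{t}$, and 

— $Q \neq R$ if $\alpha$ and $\beta$ are both forwards or both backwards; 
— $P \neq S$ otherwise; 

then $t \sim t'$. The equivalence classes of forward transitions, written $[P, a, Q]$, are 
the events. The equivalence classes of reverse transitions, written $[P, \underline{a}, Q]$, are 
the reverse events. Define a labelling function $\ell$ from $\to / \sim$ to $\mathsf{Lab}$ by setting 
$\ell([P, a, Q]) = a$. 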


Events are introduced as a derived notion in an LTS with independence in [33], 
in the context of forward-only computation. We have changed their definition by 
using coinitial independence at all corners of the diamond, yielding rotational 
symmetry. This reflects our view that forward and backward transitions have 
equal status. 

Our definition can be simplified if the LTSI, and independence in particular, 
are well-behaved. Thus, we now add a further axiom related to independence. 


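Definition 4.2 (Coinitial Propagation of Independence (CPI)). If $t : P \xrightarrow{\alpha} Q$, 
$u : P \xrightarrow{\beta} R$, $u' : Q \xrightarrow{\beta} S$ and $t' : R \xrightarrow{\alpha} S$ with $t \mathrel{\iota} u$, then $u' \mathrel{\iota} \underline{t}$. 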


CPI states that independence is a property of commuting diamonds more 
than of their specific pairs of edges. Indeed, it allows independence to propagate 
around a commuting diamond. 


Definition 4.3. If a combined LTSI satisfies axioms SP, BTI, WF and CPI, 
we say that it is pre-reversible. 


The name ‘pre-reversible’ indicates that we expect to require further axioms, 
but the present four are enough to ensure that LTSIs are well-behaved, with 
events compatible with causal equivalence. Pre-reversible axioms are separated 
from further axioms by a dashed line in Table 1. 

The following non-degeneracy property was shown for occurrence transition 
systems with independence in [33, page 312], which have forward transitions 
only. We have to cope with backwards as well as forward transitions. 


Lemma 4.4. Suppose that an LTSI is pre-reversible. If we have a diamond 
$t : P \xrightarrow{\alpha} Q$, $u : P \xrightarrow{\beta} R$ with $t \mathrel{\iota} u$ together with cofinal transitions $u' : Q \xrightarrow{\beta} S$ and 
$t' : R \xrightarrow{\alpha} S$, then the diamond is non-degenerate, meaning that $P, Q, R, S$ are 
distinct states. 


If an LTSI is pre-reversible then by Lemma 4.4 and the use of CPI we can 
simplify the statement of Definition 4.1 to: 


Definition 4.5 (Event, simplified definition). Let $(\mathsf{Proc}, \mathsf{Lab}, \to, \iota)$ be a 
pre-reversible LTSI. Let $\sim$ be the smallest equivalence relation satisfying: if 
$t : P \xrightarrow{\alpha} Q$, $u : P \xrightarrow{\beta} R$, $u' : Q \xrightarrow{\beta} S$, $t' : R \xrightarrow{\alpha} S$, and $t \mathrel{\iota} u$, then $t \sim t'$. 
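
For a finite pre-reversible LTSI, the equivalence classes of the simplified 
definition can be computed by a union-find pass over all independent coinitial 
pairs. The Python sketch below illustrates this under our own assumptions: 
transitions are triples (source, label, target), "~a" encodes the reverse label 
of a, and independence is a set of ordered pairs containing both orderings. 
The function name events is hypothetical. 

```python
def rev(t):
    """Reverse a transition (p, label, q); "~a" encodes the reverse label of a."""
    p, label, q = t
    return (q, label[1:] if label.startswith("~") else "~" + label, p)

def events(trans, indep):
    """Equivalence classes of ~: opposite sides of independent diamonds are merged."""
    parent = {t: t for t in trans}

    def find(t):
        while parent[t] != t:
            t = parent[t]
        return t

    targets = {x[2] for x in trans}
    for (p, a, q), (p2, b, r) in indep:
        if p != p2:
            continue  # only coinitial pairs span a diamond
        for s in targets:
            # Diamond t: P -a-> Q, u: P -b-> R, u': Q -b-> S, t': R -a-> S.
            if (q, b, s) in trans and (r, a, s) in trans:
                parent[find((p, a, q))] = find((r, a, s))  # t ~ t'

    classes = {}
    for t in trans:
        classes.setdefault(find(t), set()).add(t)
    return list(classes.values())

# One diamond, with independence at all four corners (as BTI and CPI require).
forward = {("P", "a", "Q"), ("P", "b", "R"), ("Q", "b", "S"), ("R", "a", "S")}
trans = forward | {rev(t) for t in forward}
corners = [(("P", "a", "Q"), ("P", "b", "R")),
           (("Q", "b", "S"), ("Q", "~a", "P")),
           (("S", "~a", "R"), ("S", "~b", "Q")),
           (("R", "~b", "P"), ("R", "a", "S"))]
indep = set(corners) | {(u, t) for t, u in corners}
classes = events(trans, indep)
print(len(classes))  # 4
```

The four classes are the events for a and b together with the two 
corresponding reverse events. 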


We are now able to show independence of diamonds (ID), which can be seen 
as dual of SP. 


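Definition 4.6 (Independence of Diamonds (ID)). An LTSI satisfies the 
Independence of Diamonds property (ID) if whenever we have a diamond 
$t : P \xrightarrow{\alpha} Q$, $u : P \xrightarrow{\beta} R$, $u' : Q \xrightarrow{\beta} S$ and $t' : R \xrightarrow{\alpha} S$, with 

— $Q \neq R$ if $\alpha$ and $\beta$ are both forwards or both backwards; 
— $P \neq S$ otherwise; 

then $t \mathrel{\iota} u$. 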


Proposition 4.7. If an LTSI satisfies BTI and CPI then it satisfies ID. 


We now consider the interaction between events and causal equivalence. We 
need some notation first. 


Definition 4.8. Let $r$ be a path in an LTSI $\mathcal{L}$ and let $e$ be an event of $\mathcal{L}$. Let 
$\sharp(r, e)$ be the number of occurrences of transitions $t$ in $r$ such that $t \in e$, minus 
the number of occurrences of transitions $t$ in $r$ such that $t \in \underline{e}$. 


We now show that $\sharp(r, e)$ is invariant under causal equivalence. 


Lemma 4.9. Let $\mathcal{L}$ be a pre-reversible LTSI. Let $r \approx s$. Then for each event $e$ 
we have that $\sharp(r, e) = \sharp(s, e)$. 
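
A small Python sketch may help to read the counting function and the 
invariance statement. It assumes, for illustration only, that an event is 
given as a set of forward transitions encoded as triples, that "~a" encodes the 
reverse label of a, and that the reverse event consists of the reverses of the 
transitions in the event. The name sharp is hypothetical. 

```python
def rev(t):
    """Reverse a transition (p, label, q); "~a" encodes the reverse label of a."""
    p, label, q = t
    return (q, label[1:] if label.startswith("~") else "~" + label, p)

def sharp(path, event):
    """Occurrences of transitions of the event minus occurrences of their reverses."""
    reverse_event = {rev(t) for t in event}
    return sum((t in event) - (t in reverse_event) for t in path)

# In the diamond P -a-> Q -b-> S, P -b-> R -a-> S the two a-transitions
# form one event (assumed here rather than computed).
a_event = {("P", "a", "Q"), ("R", "a", "S")}
r = [("P", "a", "Q"), ("Q", "b", "S")]
s = [("P", "b", "R"), ("R", "a", "S")]
loop = r + [rev(t) for t in reversed(s)]   # r followed by the inverse of s
print(sharp(r, a_event), sharp(s, a_event), sharp(loop, a_event))  # 1 1 0
```

The paths r and s differ only by a swap of independent transitions, and they 
give the same count; the loop from P back to P gives 0, as it would for the 
empty path. 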


Lemma 4.9 generalises what was shown for the forward-only setting in [33, 
Corollary 4.3]. 


Proposition 4.10. If an LTSI is pre-reversible, then for any rooted path $r$ and 
any forward event $e$ we have $\sharp(r, e) \geq 0$. 


4.2 CS and CL via Independence of Transitions 


We first define causal safety and liveness using the independence relation. 


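Definition 4.11. Let $\mathcal{L} = (\mathsf{Proc}, \mathsf{Lab}, \to, \iota)$ be a pre-reversible LTSI. 


1. We say that $\mathcal{L}$ is causally safe (CS) if whenever $P \xrightarrow{a} Q$, $r : Q \xrightarrow{\rho}_* R$, 
   $\sharp(r, [P, a, Q]) = 0$ and $S \xrightarrow{a} R$ with $(P, a, Q) \sim (S, a, R)$, then $(P, a, Q) \mathrel{\iota} t$ for 
   all $t$ in $r$ such that $\sharp(r, [t]) > 0$. 

2. We say that $\mathcal{L}$ is causally live (CL) if whenever $P \xrightarrow{a} Q$, $r : Q \xrightarrow{\rho}_* R$ and 
   $\sharp(r, [P, a, Q]) = 0$ and $(P, a, Q) \mathrel{\iota} t$, for all $t$ in $r$ such that $\sharp(r, [t]) > 0$, then 
   we have $S \xrightarrow{a} R$ with $(P, a, Q) \sim (S, a, R)$. 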


We may wish to close the independence relation over this axiom: 


Definition 4.12 (Independence Respects Events (IRE)). Whenever 
$t \sim t' \mathrel{\iota} u$ we have $t \mathrel{\iota} u$. 


IRE is one of the conditions in the definition of transition systems with inde- 
pendence [33, Definition 3.7]. Together with the axioms for pre-reversibility, it 
is enough to show both causal safety and causal liveness. 


Theorem 4.13. Let a pre-reversible LTSI satisfy IRE. Then it satisfies CS. 


Theorem 4.14. Let a pre-reversible LTSI satisfy IRE. Then it satisfies CL. 


CS and CL are not derivable from CC; we give an example LTSI which 
satisfies CC but not CS and not CL. 


Example 4.15. Consider the LTS in Figure 1. Independence is mostly coinitial 
and given by closing under BTI and CPI. Additionally we make the leftmost a- 
transition independent with all b-transitions. Note that all a-transitions belong 
to the same event, and all b-transitions belong to the same event. Also SP and 
WF hold, so that the LTSI is pre-reversible, and CC holds. However IRE does 
not hold. Furthermore CS fails using Definition 4.11. Indeed, consider any path 
2o from the start. CS would imply that the first b is independent with the a 
but this is not the case (we do have $b \mathrel{\iota} \underline{a}$). 


Also CL fails using Definition 4.11. Indeed, consider any path = from the 
start. Since the leftmost a-transition is independent with all b-transitions, we 
should be able to reverse a at the end of the path, but this is not possible. 


The next axiom states that independence is fully determined by its restriction 
to coinitial transitions. This is related to axiom (E) of [33, page 325], but here 
we allow reverse as well as forward transitions. 


Definition 4.16 (Independence of Events is Coinitial (IEC)). If $t_1 \mathrel{\iota} t_2$ 
then there are $t_1' \sim t_1$, $t_2' \sim t_2$ such that $t_1'$ and $t_2'$ are coinitial and $t_1' \mathrel{\iota} t_2'$. 


Thanks to previous axioms, independence behaves well w.r.t. reversing. 


Definition 4.17 (Reversing Preserves Independence (RPI)). If $t \mathrel{\iota} t'$ 
then $\underline{t} \mathrel{\iota} t'$. 


Proposition 4.18. If an LTSI satisfies SP, CPI, IRE, IEC then it also satisfies 
RPI. 


All the axioms that we have introduced are independent, i.e. none is derivable 
from the remaining axioms. 


Proposition 4.19. SP, BTI, WF, CPI, IRE, IEC are independent of each other. 


4.3 CS and CL via Ordering of Events 


To define CS and CL via ordering of events, we define the causality relation $\leq$ 
on events. 


Definition 4.20. Let $\mathcal{L} = (\mathsf{Proc}, \mathsf{Lab}, \to, \iota)$ be an LTSI. Let $e, e'$ be events of $\mathcal{L}$. 
Let $e \leq e'$ iff for all rooted paths $r$, if $\sharp(r, e') > 0$ then $\sharp(r, e) > 0$. As usual $e < e'$ 
means $e \leq e'$ and $e \neq e'$. If $e < e'$ we say that $e$ is a cause of $e'$. 


Lemma 4.21. If an LTSI satisfies SP, BTI, WF and CPI then $\leq$ is a partial 
ordering on events. 


Previously, orderings on events have been defined using forward-only rooted 
paths; in fact, the definitions coincide for pre-reversible LTSIs. 


Definition 4.22 ([12,28]). Let $\mathcal{L} = (\mathsf{Proc}, \mathsf{Lab}, \to, \iota)$ be an LTSI. Let $e, e'$ be 
events of $\mathcal{L}$. Let $e \leq_f e'$ iff for all rooted forward-only paths $r$, if $r$ contains a 
representative of $e'$ then $r$ also contains a representative of $e$. 


Lemma 4.23. For any LTSI, $e \leq e'$ implies $e \leq_f e'$. If an LTSI satisfies SP, 
BTI, WF and CPI then $e \leq_f e'$ implies $e \leq e'$. 


Proof. Straightforward using PL and Lemma 4.9. 


We now give definitions of causal safety and causal liveness using ordering 
on events. 


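Definition 4.24. Let $\mathcal{L} = (\mathsf{Proc}, \mathsf{Lab}, \to, \iota)$ be an LTSI. 


1. We say that $\mathcal{L}$ is ordered causally safe ($\mathrm{CS}_{<}$) if whenever $P \xrightarrow{a} Q$, 
   $r : Q \xrightarrow{\rho}_* R$, $\sharp(r, [P, a, Q]) = 0$ and $S \xrightarrow{a} R$ with $(P, a, Q) \sim (S, a, R)$, then 
   $[P, a, Q] \not< e'$ for all $e'$ such that $\sharp(r, e') > 0$. 

2. We say that $\mathcal{L}$ is ordered causally live ($\mathrm{CL}_{<}$) if whenever $P \xrightarrow{a} Q$, 
   $r : Q \xrightarrow{\rho}_* R$ and $\sharp(r, [P, a, Q]) = 0$ and $[P, a, Q] \not< e'$ for all $e'$ such that 
   $\sharp(r, e') > 0$ then we have $S \xrightarrow{a} R$ with $(P, a, Q) \sim (S, a, R)$. 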


We postpone giving proofs of $\mathrm{CS}_{<}$ and $\mathrm{CL}_{<}$ until we have introduced a further 
definition of causal safety and liveness using independence of events. 


4.4 CS and CL via Independent Events 


We now introduce a third version of causal safety and liveness, which uses inde- 
pendence like CS and CL, but on events rather than on transitions. First we lift 
independence from transitions to events. 


Definition 4.25 (Coinitially independent events). Let events $e, e'$ be 
(coinitially) independent, written $e \mathrel{ci} e'$, iff there are coinitial transitions $t, t'$ 
such that $[t] = e$, $[t'] = e'$ and $t \mathrel{\iota} t'$. 


Lemma 4.26. If an LTSI is pre-reversible, then if $e \mathrel{ci} e'$ we have also $e \mathrel{ci} \underline{e'}$. 


Thus in pre-reversible LTSIs, ci is fully determined just considering forward 
events. By Lemma 4.26, if we know e ci e’ then we know und(e) ci und(e’). 
We can give a third formulation of causal safety and liveness using ci: 


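Definition 4.27. Let $\mathcal{L} = (\mathsf{Proc}, \mathsf{Lab}, \to, \iota)$ be a pre-reversible LTSI. 


1. We say that $\mathcal{L}$ is coinitially causally safe ($\mathrm{CS}_{ci}$) if whenever $P \xrightarrow{a} Q$, 
   $r : Q \xrightarrow{\rho}_* R$, $\sharp(r, [P, a, Q]) = 0$ and $S \xrightarrow{a} R$ with $(P, a, Q) \sim (S, a, R)$, then 
   $[P, a, Q] \mathrel{ci} e$ for all forward events $e$ such that $\sharp(r, e) > 0$. 

2. We say that $\mathcal{L}$ is coinitially causally live ($\mathrm{CL}_{ci}$) if whenever $P \xrightarrow{a} Q$, 
   $r : Q \xrightarrow{\rho}_* R$ and $\sharp(r, [P, a, Q]) = 0$ and $[P, a, Q] \mathrel{ci} e$, for all forward events $e$ 
   such that $\sharp(r, e) > 0$, then we have $S \xrightarrow{a} R$ with $(P, a, Q) \sim (S, a, R)$. 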


Note that in Definition 4.27 we operate at the level of events, rather than at the 
level of transitions as in Definition 4.11. 


Theorem 4.28. If an LTSI is pre-reversible then it satisfies $\mathrm{CS}_{ci}$. 


We now introduce a weaker version of axiom IRE (Definition 4.12). 


Definition 4.29 (Coinitial IRE (CIRE)). If $[t] \mathrel{ci} [u]$ and $t, u$ are coinitial 
then $t \mathrel{\iota} u$. 


Theorem 4.30. If a pre-reversible LTSI satisfies CIRE then it satisfies $\mathrm{CL}_{ci}$. 


We next give an example where CC holds but not $\mathrm{CS}_{ci}$ (and not CPI). 


Example 4.31. Consider the cube with transitions $a, b, c$ on the left in Figure 2, 
where the forward direction is from left to right. We add independence as given 
by BTI. So SP, BTI, WF hold, but not CPI. From the start we have an 
$a$-transition followed by a path $r = bc$ followed by $\underline{a}$. For $\mathrm{CS}_{ci}$ to hold, we want 
$\underline{a}$ to be the reverse of the same event as the first $a$. They are connected by a 
ladder with sides $cb$. We add independence for all corners on the two faces of 
the ladder ($ab$ and $ac$). Then we get $bc \approx cb$ (independence at a single corner is 
enough). However the $b$s are not the same event since the $bc$ face does not have 
independence at each corner. Therefore we do not get $[a] \mathrel{ci} [b]$, and $\mathrm{CS}_{ci}$ fails. 


We next give an example where $\mathrm{CS}_{ci}$ and $\mathrm{CL}_{ci}$ hold but not CC. 

Fig. 2. 


Example 4.32. Consider the LTSI with $Q_i \xrightarrow{a} P_i$, $P_{i+1} \xrightarrow{c} P_i$, $Q_{i+1} \xrightarrow{c} Q_i$, 
$P_{i+1} \xrightarrow{b} Q_i$ for $i = 0, 1, \ldots$. This is shown on the right in Figure 2. Clearly WF 
does not hold. We add coinitial independence to make BTI and CPI hold. Then 
also SP and CIRE hold. However, CC fails since, for example, $P_1 \xrightarrow{b} Q_0 \xrightarrow{a} P_0$ and 
$P_1 \xrightarrow{c} P_0$ are coinitial and cofinal but not causally equivalent. Note that there 
are just three events $a, b, c$ with $a \mathrel{ci} c$, $b \mathrel{ci} c$ but not $a \mathrel{ci} b$. $\mathrm{CS}_{ci}$ and $\mathrm{CL}_{ci}$ hold. 
Indeed, $c$ is independent from every other action, and it can always be undone, 
while $a$ and $b$ are independent from $c$ only and they can be undone after any 
path composed by $c$ and no others. 


4.5 Polychotomy 


In this section we relate our three versions of causal safety and liveness, with 
the help of what we call polychotomy, which states that if events do not cause 
each other and are not in conflict, then they must be independent. We start by 
defining a conflict relation on events. 


Definition 4.33. Two forward events $e, e'$ are in conflict, written $e \mathrel{\#} e'$, if 
there is no rooted path $r$ such that $\sharp(r, e) > 0$ and $\sharp(r, e') > 0$. 


Much as for orderings, conflict on events has been defined previously using 
forward-only rooted paths [12,28]; in fact, the definitions coincide for pre-reversible 
LTSIs. We omit the details. 


Definition 4.34 (Polychotomy). Let $\mathcal{L}$ be a pre-reversible LTSI. We say that 
$\mathcal{L}$ satisfies polychotomy if whenever $e, e'$ are forward events, then exactly one of 
the following holds: 1. $e = e'$; 2. $e < e'$; 3. $e' < e$; 4. $e \mathrel{\#} e'$; or 5. $e \mathrel{ci} e'$. 


Property NRE below is related to polychotomy. 


Definition 4.35 (No Repeated Events (NRE)). In any rooted path $r$, for 
any forward event $e$ we have $\sharp(r, e) \leq 1$. 


Lemma 4.36 (Polychotomy). Suppose that a pre-reversible LTSI satisfies 
NRE. Then polychotomy holds. 


Example 4.37. Consider the LTSI in Figure 3. We add independence to make 
BTI and CPI hold. Both SP and WF hold. Hence, CC holds as well. There are 
three events, labelled with $a, b, c$. Clearly NRE fails for both $a$ and $b$. We see that 
$a < c$ but also $a \mathrel{ci} c$, so that polychotomy fails. $\mathrm{CS}_{ci}$ holds by Theorem 4.28. 
However $\mathrm{CS}_{<}$ fails: consider the transition $P \xrightarrow{a} Q$ together with the path 
$r : Q \to_* R$ and $S \xrightarrow{a} R$, and note that $a < c$. 


The next lemma allows us to connect ordered safety and liveness with coinitial 
safety and liveness. 


Lemma 4.38. Suppose that a pre-reversible LTSI satisfies NRE. Suppose 
$P \xrightarrow{a} Q$, $e = [P, a, Q]$, $r : Q \xrightarrow{\rho}_* R$ and $\sharp(r, e') > 0$ where $e'$ is a forward event. Then 
exactly one of $e \mathrel{ci} e'$ and $e < e'$ holds. 


Proposition 4.39. Suppose that a pre-reversible LTSI $\mathcal{L}$ satisfies NRE. Then 


1. $\mathcal{L}$ satisfies $\mathrm{CS}_{<}$. 
2. $\mathcal{L}$ satisfies $\mathrm{CL}_{ci}$ iff $\mathcal{L}$ satisfies $\mathrm{CL}_{<}$. 


Property RED below is also related to NRE and polychotomy. 


Definition 4.40. An LTSI satisfies Reverse Event Determinism (RED) 
if whenever $t, t'$ are backward coinitial transitions and $t \sim t'$ then $t = t'$. 


Proposition 4.41. If an LTSI $\mathcal{L}$ is pre-reversible then the following are 
equivalent: 1. $\mathcal{L}$ satisfies NRE; 2. $\mathcal{L}$ satisfies RED; 3. independence $ci$ is irreflexive on 
events; and 4. polychotomy holds. 


Proposition 4.42. Suppose that a pre-reversible LTSI satisfies CIRE. Then it 
also satisfies NRE. 


NRE was shown in the forward-only setting of occurrence transition systems 
with independence in [33, Corollary 4.6]. It was also shown in the reversible 
setting without independence in [28, Proposition 2.10]. 


Example 4.43. Consider the LTSI in Figure 4. Independence is given by closing 
under BTI and CPI. There are three events, labelled $a, b, c$, which are all 
independent of each other. We see that NRE holds but not CIRE. Also $\mathrm{CL}_{ci}$ and 
$\mathrm{CL}_{<}$ fail: consider $P \xrightarrow{a} Q \to R$, where $a$ cannot be reversed at $R$. 


Proposition 4.44. Let $\mathcal{L}$ be a pre-reversible LTSI. 


1. If IEC holds then $\mathrm{CL}_{ci}$ implies CL. 
2. If IEC and NRE hold then $\mathrm{CL}_{<}$ implies CL. 


5 Coinitial Independence 


In this section we consider coinitial LTSIs, defined as follows, and their relation- 
ship with LTSIs in general. 


Definition 5.1. Let $\mathcal{L} = (\mathsf{Proc}, \mathsf{Lab}, \to, \iota)$ be a combined LTSI. Then $\iota$ is 
coinitial if for all transitions $t, u$, if $t \mathrel{\iota} u$ then $t$ and $u$ are coinitial. We say that $\mathcal{L}$ is 
coinitial if $\iota$ is coinitial. 


We define a mapping c restricting general independence to coinitial transi- 
tions and a mapping g extending independence along events. 


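Definition 5.2. Given an LTSI $(\mathsf{Proc}, \mathsf{Lab}, \to, \iota)$, define $t \mathrel{g(\iota)} u$ iff 
$t \sim t' \mathrel{\iota} u' \sim u$ for some $t', u'$. Furthermore, define $t \mathrel{c(\iota)} u$ iff $t \mathrel{\iota} u$ and $t, u$ are coinitial. 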
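
The two mappings are easy to state on finite data. The Python sketch below is 
an illustration under our own assumptions: independence is a set of ordered 
pairs of transitions (triples, both orderings present), and the events are 
supplied through a dictionary event_of from transitions to event names rather 
than computed. The names c, g and event_of are hypothetical. 

```python
def c(indep):
    """Restrict independence to coinitial pairs of transitions."""
    return {(t, u) for (t, u) in indep if t[0] == u[0]}

def g(indep, event_of):
    """Extend independence along events: t g(i) u iff t ~ t' i u' ~ u."""
    members = {}
    for t, e in event_of.items():
        members.setdefault(e, set()).add(t)
    return {(t, u)
            for (t1, u1) in indep
            for t in members[event_of[t1]]
            for u in members[event_of[u1]]}

# Forward transitions of one diamond, with the a-sides and b-sides as events.
event_of = {("P", "a", "Q"): "ea", ("R", "a", "S"): "ea",
            ("P", "b", "R"): "eb", ("Q", "b", "S"): "eb"}
t, u = ("P", "a", "Q"), ("P", "b", "R")
indep = {(t, u), (u, t)}          # coinitial independence at P only

general = g(indep, event_of)
print(len(general))               # 8
print(c(general) == indep)        # True
```

In this example restricting the extended relation to coinitial pairs gives 
back the relation we started from. 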


Proposition 5.3. Let $\mathcal{L} = (\mathsf{Proc}, \mathsf{Lab}, \to, \iota)$ be a pre-reversible LTSI. 


1. If $\mathcal{L}$ is coinitial and satisfies CIRE then $\mathcal{L}' = (\mathsf{Proc}, \mathsf{Lab}, \to, g(\iota))$ is a 
   pre-reversible LTSI and satisfies IRE and IEC. 

2. If $\mathcal{L}$ satisfies IRE then $\mathcal{L}' = (\mathsf{Proc}, \mathsf{Lab}, \to, c(\iota))$ is a pre-reversible coinitial 
   LTSI and satisfies CIRE. 


Thanks to Proposition 5.3, we can extend a coinitial pre-reversible LTSI sat- 
isfying CIRE in a canonical way to a pre-reversible LTSI satisfying IRE and 
IEC. 

In some reversible calculi (such as RCCS) independence of coinitial transi- 
tions is defined purely by reference to the labels. If this is the case it is a simple 
matter to verify the axioms CPI and CIRE. 


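Proposition 5.4. Let $\mathcal{L} = (\mathsf{Proc}, \mathsf{Lab}, \to, \iota)$ be a coinitial combined LTSI. 
Suppose that $I$ is a binary relation on $\mathsf{Lab}$ such that for any coinitial transitions 
$t : P \xrightarrow{\alpha} Q$ and $u : P \xrightarrow{\beta} R$ we have $t \mathrel{\iota} u$ iff $I(a, b)$, where $a$ and $b$ are the 
underlying labels $a = \mathrm{und}(\alpha)$, $b = \mathrm{und}(\beta)$. Then $\mathcal{L}$ satisfies CPI and CIRE. 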


Proof. Straightforward, noting that labels on opposite sides of a diamond of 
transitions must be equal. 


Note that $I$ must be irreflexive, since $\iota$ is irreflexive. 
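
To illustrate label-based independence, the following Python sketch derives 
a coinitial independence relation from a relation on labels. The encoding is 
assumed for this sketch only: transitions are triples (source, label, target), 
"~a" encodes the reverse label of a, and the relation on labels is a set of 
pairs. The function name label_independence is hypothetical. 

```python
def rev(t):
    """Reverse a transition (p, label, q); "~a" encodes the reverse label of a."""
    p, label, q = t
    return (q, label[1:] if label.startswith("~") else "~" + label, p)

def und(label):
    """Underlying action label."""
    return label[1:] if label.startswith("~") else label

def label_independence(trans, I):
    """Coinitial t, u are independent iff their underlying labels are related by I."""
    return {(t, u) for t in trans for u in trans
            if t[0] == u[0] and (und(t[1]), und(u[1])) in I}

I = {("a", "b"), ("b", "a")}      # irreflexive and symmetric
forward = {("P", "a", "Q"), ("P", "b", "R"), ("Q", "b", "S"), ("R", "a", "S")}
trans = forward | {rev(t) for t in forward}
indep = label_independence(trans, I)
print(len(indep))  # 8
```

On this diamond the derived relation relates the two transitions at each of 
the four corners, in both orderings, so independence given at one corner is 
automatically present at the others. 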

If we have a coinitial pre-reversible LTSI satisfying CIRE then $\mathrm{CS}_{<}$ and $\mathrm{CL}_{<}$ 
hold (using Proposition 4.42 and Proposition 4.39). Applying mapping $g$ we get 
a general pre-reversible LTSI satisfying IRE and IEC by Proposition 5.3. This 
will satisfy CS and CL as a result of applying Theorem 4.13 and Theorem 4.14 
respectively. It will also satisfy $\mathrm{CS}_{<}$ and $\mathrm{CL}_{<}$. Conversely, if we have a general 
pre-reversible LTSI satisfying IRE then CS and CL hold by Theorem 4.13 and 
Theorem 4.14 respectively. Applying mapping $c$ we get a coinitial pre-reversible 
LTSI satisfying CIRE. This will satisfy $\mathrm{CS}_{<}$ and $\mathrm{CL}_{<}$. 


6 Case Studies 


We look at whether our axioms hold in various reversible formalisms. Remark- 
ably, all the works below provide proofs of the Loop Lemma. 


RCCS. We consider here the semantics of RCCS in [6], and restrict the attention 
to coherent processes [6, Definition 2]. In RCCS, transitions $P \xrightarrow{\mu:\zeta} Q$ and 
$P \xrightarrow{\mu':\zeta'} Q'$ are concurrent if $\mu \cap \mu' = \emptyset$ [6, Definition 7]. This allows us to define 
coinitial independence as $t \mathrel{\iota} u$ iff $t$ and $u$ are concurrent. We now argue that the 
resulting coinitial LTSI is pre-reversible and also satisfies CIRE. SP was shown in [6, 
Lemma 8]. BTI was shown in the proof of [6, Lemma 10]. WF is straightforward, 
noting that backward transitions decrease memory size. Hence, we obtain a very 
much simplified proof of CC. For CPI and CIRE we note that independence 
is defined on the underlying labels and thus Proposition 5.4 applies. Therefore 
$\mathrm{CS}_{<}$ and $\mathrm{CL}_{<}$ hold. Using Proposition 5.3, we can get an LTSI with general 
independence satisfying IRE and IEC, and therefore CS and CL. This is the 
first time these causal properties have been proved for RCCS. 


HOπ. We consider here the uncontrolled reversible semantics for HOπ [18]. We 
restrict our attention to reachable processes, called there consistent. The 
semantics is a reduction semantics; hence there are no labels (or, equivalently, all the 
labels coincide). To have more informative labels we can consider the transitions 
defined in [18, Section 3.1], where labels are composed of memory information 
and a flag denoting whether the transition is forward or backward. The notion 
of independence would be given by the concurrency relation on coinitial 
transitions [18, Definition 9]. All pre-reversible LTSI axioms hold, as well as CIRE 
which is needed for causal safety and liveness. Specifically, SP is proved in [18, 
Lemma 9]. BTI holds since distinct memories have disjoint sets of keys [18, 
Definition 3 and Lemma 3] and by the definition of concurrency [18, Definition 9]. 
WF holds as each backward step consumes a memory, which is finite to start 
with. Finally, CPI and CIRE are valid since the notion of concurrency is defined 
on the annotated labels and using our Proposition 5.4. 
As a result we obtain a very much simplified proof of CC. Moreover, using 
CPI and CIRE, we get the $\mathrm{CS}_{<}$ and $\mathrm{CL}_{<}$ safety and liveness properties and, 
applying mapping $g$ from Section 5, we get a general pre-reversible LTSI satisfying 
IRE and IEC, hence CS and CL are satisfied. This is the first time that causal 
properties have been shown for HOπ. 


Rπ. We consider the (uncontrolled) reversible semantics for π-calculus defined 
in [5]. We restrict the attention to reachable processes. The semantics is an LTS 
semantics. Independence is given as concurrency which is defined for consecutive 
transitions [5, Definition 4.1]. CC holds [5, Theorem 4.5]. 

Our results are not directly applicable to Rπ, since SP holds up to label 
equivalence of transitions on opposite sides of the diamond, rather than equality 
of labels as in our approach. We would need to extend axiom SP and the 
definition of causal equivalence to allow for label equivalence in order to handle Rπ 
using our axiomatic method. 


Erlang. We consider the uncontrolled reversible (reduction) semantics for 
Erlang in [20]. We restrict our attention to reachable processes. In order to have 
more informative labels we can consider the annotations defined in [20, Sec- 
tion 4.1]. We then can define coinitial transitions to be independent if they are 
concurrent [20, Definition 12]. 

We next discuss the validity of our axioms in reversible Erlang. SP is proved 
in [20, Lemma 13] and BTI is trivial from the definition of concurrency [20, 
Definition 12]. WF holds since the pairs of integers (total number of elements in 
memories, total number of messages queued) ordered under lexicographic order 
are always positive and decrease at each backward step. Intuitively, each step but 
the ones derived using the rule for reverse sched (see [20, Figure 11]) consumes 
an item of memory, and each step derived using rule reverse sched removes a 
message from a process queue. Finally, CPI and CIRE hold since the notion of 
concurrency is defined on the annotated labels, and by Proposition 5.4. 

Since the setting is very similar to the one of HOπ (both calculi have a 
reduction semantics and a coinitial notion of independence defined on enriched 
labels), we get the same results as for HOπ, including CC, and CS and CL. 


Reversible occurrence nets. Reversible occurrence nets [25,24] are traditional 
occurrence nets (safe and with no backward conflicts) extended with a reverse 
transition for each forward transition. They give rise to an LTS where states are 
pairs $(N, m)$ with $N$ a net and $m$ a marking. A computation that represents 
firing a transition $t$ in $(N, m)$ and resulting in $(N, m')$ is given by a firing relation 
$(N, m) \xrightarrow{t} (N, m')$. The notion of independence is the concurrency relation [25, 
Section 3] which is defined between arbitrary firings (transitions). Hence, we 
get a general LTSI. The CC property is shown by following the traditional 
approach in [6]. SP and PL are shown as well. PL and CC require several pages of 
proofs [24]. The causal safety and causal liveness properties are not considered 
in [25,24]. 

We can obtain CC, and additionally CS and CL, as follows. SP and BTI 
are proved for reversible occurrence nets in [24] as Lemma 4.3 and Lemma 3.3 
respectively. WF holds because there are no forward cycles of firings in 
occurrence nets, hence no infinite reverse paths. In order to have CS and CL, we need 
to show CPI and IRE. Lemma 3.4 in [24] gives CPI. Events can be defined on 
firings as in our Definition 4.5, and then IRE holds as the concurrency relation 
preserves such events. 


7 Conclusion, Related and Future Work 


The literature on causal-consistent reversibility (see, for example, the early 
survey [19]) has a number of proofs of results such as the parabolic lemma (PL) and 
the causal consistency property (CC), all of which are instantiated to a specific 
calculus, language or formalism. We have taken here a complementary approach, 
analysing the properties of interest in an abstract and language-independent 
setting. In particular, we have shown how to prove the most relevant of these 
properties from a small number of axioms. 

Our approach builds upon [28], where a set of axioms for reverse LTSs was 
given and several interesting properties were shown. While the idea is similar, 
the development is rather different since we consider more basic axioms (we 
only share WF, while many of the axioms in [28], such as UT, follow from 
ours), and since the two papers focus on different properties. We focus on CC 
and various forms of CS and CL, while [28] considers correspondence with prime 
event structures and reversible bisimulations. Moreover, LTSs in [28] do not have 
a notion of independence. 

In other related work, we may particularly mention [8], which like ours takes 
an abstract view, though based on category theory. However, its results concern 
irreversible actions, and do not provide insights in our setting, where all actions 
are reversible. The only other work which takes a general perspective is [3], which 
concentrates on how to derive a reversible extension of a given formalism. How- 
ever, proofs concern a limited number of properties (essentially our CC), and 
hold only for extensions built using the technique proposed there. Also [27,29] 
are general, since they propose how to reverse a calculus that can be defined in 
a general format of SOS rules. However, the format has its syntactic constraints 
while our approach abstracts from them. Finally, [9] presents a number of properties
such as, for example, backward confluence, which arise in the context of
reversing steps of executed transitions in Place/Transition nets.

The approach proposed in this paper opens a number of new possibilities. 
Firstly, when devising a new reversible formalism, our results provide a rich tool- 
box to prove (or disprove) relevant properties in a simple way. This is particularly 
relevant since causal-consistent reversibility is getting applied to more and more 
complex languages, such as Erlang [20], where direct proofs become cumbersome 
and error-prone. Secondly, our abstract proofs are relatively easy to formalise 
in a proof-assistant, which is even more relevant given that this will certify the 
correctness of the results for many possible instances. Another possible extension 
of our work concerns integrating into our framework irreversible actions [7]. In 
order to do that we could take inspiration from the above-mentioned [8]. 
Abstract. We describe a set of simple features that are sufficient in order to make
the satisfiability problem of logics interpreted on trees TOWER-hard. We exhibit 
these features through an Auxiliary Logic on Trees (ALT), a modal logic that essen- 
tially deals with reachability of a fixed node inside a forest and features modalities 
from sabotage modal logic to reason on submodels. After showing that ALT ad- 
mits a TOWER-complete satisfiability problem, we prove that this logic is captured 
by four other logics that were independently found to be TOWER-complete: two- 
variables separation logic, quantified computation tree logic, modal logic of heaps 
and modal separation logic. As a by-product of establishing these connections, we 
discover strict fragments of these logics that are still non-elementary. 


1 Introduction 


In mathematical logic there is a well-known trade-off between expressive power and 
complexity, where weaker languages cannot capture interesting properties of complex 
systems, whereas finding solutions of a given problem is infeasible for richer languages. 
For instance, many verification tasks, such as reachability and homomorphism queries,
happen to be expressible in monadic second-order logic (MSO) [15]. This logic is however 
not usable in practice, as its satisfiability problem SAT(MSO) is undecidable in general 
and was famously proved by Rabin to be decidable but non-elementary when the 
logic is interpreted on trees or on one unary function. A more recent analysis that uses the 
hierarchy of non-elementary ranking functions classifies SAT(MSO) on these two 
structures as TOWER-complete, i.e. complete for the class of problems of time complexity 
bounded by a tower of exponentials, whose height is an elementary function of the input. 

In order to bypass these problems, a general approach is to design restrictions of MSO 
that can solve complex reasoning tasks while being more appealing complexity-wise. An 
example of this is given by the framework of temporal logics, formalisms that describe 
the evolution of reactive systems [24]. Among the various temporal logics, from the 
classical linear temporal logic (LTL) and computation tree logic (CTL) [13], as well
as their fragments [2,33], to the more recently developed interval temporal logics [7,8],
the main common feature of this framework is perhaps the ability to check whether the 
system can evolve to a certain configuration, i.e. a reachability query. In this context, 
we recall the landmark result on the satisfiability of CTL, shown EXPTIME-complete 
by Fischer and Ladner [23]. Another possibility to deal with the complexity of MSO
is to restrict the second-order quantifications to specific submodels. This is the idea 
behind ambient logic [16], separation logic and, more generally, bunched logics
and graph logics [1]. These logics provide primitives for reasoning about resource
composition, mainly by adding a spatial conjunction $\varphi * \psi$, which requires splitting
a model into two disjoint pieces, one satisfying $\varphi$ and the other satisfying $\psi$. Similar
ideas are developed in sabotage modal logics, where the formula $\blacklozenge\varphi$, headed by the
sabotage modality $\blacklozenge$, states that $\varphi$ must hold in a graph obtained by removing one
edge from the current model [4,21]. Within these logics, we highlight the quantifier-free
fragment of separation logic restricted to the $*$ operator, denoted here with SL(*) and
whose satisfiability problem is proved to be PSPACE-complete in [12].

Once a framework provides a solid foundation for reasoning tasks, a natural step is to 
extend its expressiveness while keeping its complexity in check. Sometimes the additional 
capabilities do not change the complexity of the logic, as for example SL(*) extended 
with reachability predicates, whose satisfiability problem is still PSPACE-complete [20]. 
However, it often happens that the new features make the problem jump to higher 
complexity classes and, sometimes, reach MSO. We pinpoint two instances of this: 

— SL(*) enriched with first-order quantifiers, albeit less expressive than MSO
  interpreted on one unary function, has a TOWER-complete satisfiability problem [9].
— CTL enriched with propositional quantifiers has an undecidable satisfiability problem
  on general models. On trees (i.e. QCTL^t), the problem is TOWER-complete [28].

Consequently, it is natural to ask ourselves why the additional features made the problem
harder. Answering this question requires studying the interplay between the various
operators of the logic, searching for a sufficient set of conditions explaining its complexity.


Our motivation. Second-order features often lead to logics with TOWER-hard satisfiability
problems, as illustrated above for first-order SL(*) and QCTL^t. A good amount of
research has been done independently on these logics [5,9,17,28], culminating with the
TOWER-hardness of SL(*) with two quantified variables and the TOWER-hardness of
QCTL with just one temporal operator between exists-finally EF and exists-next EX
(see Section 4 for the definitions). Connections between these two formalisms have not
been explicitly developed so far, perhaps because the logics are quite different: QCTL^t
is built on top of propositional calculus and is interpreted on infinite trees, whereas
SL(*) does not feature propositional symbols and is essentially interpreted on finite
structures. Nevertheless, we argue that these and other logics are related not only because
they are fragments of MSO, but also because they share a form of reachability and an ability
to reason on submodels which is sufficient to obtain TOWER-hard logics.


Our contribution. We make explicit the common features that lead to TOWER-hard logics
by relying on an Auxiliary Logic on Trees (ALT), introduced in Section 2. ALT reasons
about reachability of a fixed target node inside a finite forest and features modalities from
sabotage logics to reason on submodels. Here, reachability should be understood as the
ability to reach the target node in at least one step, starting from a “current” node which
can be updated thanks to the existential modality somewhere $\langle U\rangle$ [26]. In Section 3 we
take a look at the expressive power of ALT and show that SAT(ALT) is TOWER-hard. In
Section 4 we then display how ALT is captured by first-order SL(*) and QCTL^t, as well
as modal logic of heaps (MLH) and modal separation logics (MSL), two other logics
introduced in and [18], respectively. In this context, besides exposing that all these
logics are TOWER-hard because of the way they reason about reachability and submodels,
we discover interesting sublogics that are still TOWER-complete:

— QCTL restricted to $\mathsf{E}(\varphi\,\mathsf{U}\,\psi)$ modalities, where $\varphi, \psi$ are Boolean combinations of
  atomic propositions, or to the EF modality, which can be nested at most once.

— the common fragment of MLH and MSL having Boolean connectives and the modalities
  $\Diamond$, $\langle U\rangle$ and $*$. Notice that this logic does not have propositional symbols.


2 The definition of an Auxiliary Logic on Trees 


We introduce an Auxiliary Logic on Trees (ALT). Its formulae are from the grammar: 


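\[ \varphi := \top \mid \varphi \wedge \varphi \mid \neg\varphi \mid \mathsf{T} \mid \mathsf{G} \mid \langle U\rangle\varphi \mid \blacklozenge\varphi \mid \blacklozenge^{*}\varphi \]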

As we will soon clarify, the symbol $\langle U\rangle$ is borrowed from Goranko and Passy's paper
on modal logic with universal modality [26]. Similarly, readers who are familiar with
sabotage modal logics will recognise in $\blacklozenge$ the sabotage modality [4], and in $\blacklozenge^{*}$ its Kleene
closure (i.e. $\blacklozenge$ applied an arbitrary number of times). These two operators modify the
model during the evaluation of a formula, making ALT a relation-changing modal logic
(following the terminology used in [3]). However, contrary to most modal logics, ALT 
does not feature classical propositional symbols. Instead, this logic only features two 
interpreted atomic propositions T and G. Roughly speaking, T stands for “the target node is 
reachable” whereas G stands for “the target node is not reachable”. The formal definitions 
will be given soon in order to clarify these two sentences. 

Let $\mathcal{N}$ be a countably infinite set of nodes. A (finite) forest $F : \mathcal{N} \rightharpoonup_{\mathrm{fin}} \mathcal{N}$ is a partial
function (encoding the standard parent relation) that

— has a finite domain of definition, i.e. $\mathrm{dom}(F) := \{n \in \mathcal{N} \mid F(n) \text{ is defined}\}$ is finite;
— is acyclic, i.e. for every $n \in \mathrm{dom}(F)$ and $\delta \geq 1$, $F^{\delta}(n) \neq n$.

Here, $F^{\delta}$ denotes $\delta \geq 0$ functional composition(s) of $F$. Albeit non-standard, our
definition of finite forests over an infinite set of nodes simplifies the forthcoming definitions.
Besides, in Section 3.2 we show how restricting $\mathcal{N}$ to a finite set does not change the
expressive power nor the complexity of ALT.

We denote the image of $F$ as $\mathrm{ran}(F) := \{n' \mid F(n) = n' \text{ for some } n \in \mathrm{dom}(F)\}$. Given
a finite set $X$, we denote with $|X|$ its cardinality. Let $n, n'$ be two nodes. As usual, $n$
is an $F$-descendant of $n'$ (alternatively, $n'$ is an $F$-ancestor of $n$) whenever $F^{\delta}(n) = n'$
for some $\delta \geq 1$. In this case, if $\delta = 1$ then $n$ is an $F$-child of $n'$ (alternatively, $n'$ is the
$F$-parent of $n$). We drop the prefix $F$- from these terms when it is clear from the context.
Given two forests $F, F'$, we say that $F'$ is a subforest of $F$, written $F' \sqsubseteq F$, whenever
$F(n) = F'(n)$ for every $n \in \mathrm{dom}(F')$. Figure 1 intuitively represents two forests (every
“o” represents a node), the one on the left being a subforest of the one on the right.
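The definitions above can be made concrete with a small sketch. It assumes a forest is stored as a Python dictionary from each node of its domain to its parent; the helper names (`is_forest`, `ancestors`, `is_descendant`, `ran`, `is_subforest`) are illustrative and do not come from the paper.

```python
from typing import Dict, Hashable, Iterator, Set

Node = Hashable
Forest = Dict[Node, Node]  # maps each node of dom(F) to its parent F(n)


def is_forest(f: Forest) -> bool:
    """Acyclicity check: F^d(n) != n for every n in dom(F) and d >= 1."""
    for start in f:
        seen = {start}
        cur = f[start]
        while True:
            if cur in seen:      # we came back to a visited node: a cycle
                return False
            if cur not in f:     # reached a node without parent: chain ends
                break
            seen.add(cur)
            cur = f[cur]
    return True


def ancestors(f: Forest, n: Node) -> Iterator[Node]:
    """Yield F^1(n), F^2(n), ... (terminates when f is acyclic)."""
    while n in f:
        n = f[n]
        yield n


def is_descendant(f: Forest, n: Node, anc: Node) -> bool:
    """True iff F^d(n) = anc for some d >= 1."""
    return any(a == anc for a in ancestors(f, n))


def ran(f: Forest) -> Set[Node]:
    """Image of F: the nodes that are the parent of some node."""
    return set(f.values())


def is_subforest(sub: Forest, f: Forest) -> bool:
    """sub is a subforest of f iff they agree on every node of dom(sub)."""
    return all(n in f and f[n] == p for n, p in sub.items())
```

With this representation, removing one key from the dictionary corresponds to removing one edge of the forest, which is the operation the sabotage modality relies on.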

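ALT is interpreted on pointed forests $(F, t, n)$, where $F$ is a forest and $t, n \in \mathcal{N}$ are
respectively called the target node and the current evaluation node. The satisfaction
relation $\models$ is defined (throughout the paper, we omit standard clauses for $\top$, $\wedge$, $\neg$) as:

\[
\begin{aligned}
(F,t,n) \models \mathsf{T} \;&\overset{\text{def}}{\iff}\; n \text{ is an } F\text{-descendant of } t.\\
(F,t,n) \models \mathsf{G} \;&\overset{\text{def}}{\iff}\; n \in \mathrm{dom}(F) \text{ and } (F,t,n) \not\models \mathsf{T}.\\
(F,t,n) \models \langle U\rangle\varphi \;&\overset{\text{def}}{\iff}\; \text{there is } n' \in \mathcal{N} \text{ s.t. } (F,t,n') \models \varphi.\\
(F,t,n) \models \blacklozenge\varphi \;&\overset{\text{def}}{\iff}\; \text{there is } F' \text{ s.t. } F' \sqsubseteq F,\ |\mathrm{dom}(F')| + 1 = |\mathrm{dom}(F)|,\ (F',t,n) \models \varphi.\\
(F,t,n) \models \blacklozenge^{*}\varphi \;&\overset{\text{def}}{\iff}\; \text{there is } F' \text{ s.t. } F' \sqsubseteq F \text{ and } (F',t,n) \models \varphi.
\end{aligned}
\]

Fig. 1. Subforest relation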


We denote with $\bot$ the contradiction $\neg\top$. The standard connectives $\vee$ and $\Rightarrow$ are defined
as usual. The semantics of T and G is pretty straightforward. As a visual aid, the nodes in
Figure 1 satisfying T are the ones in the dark grey area, whereas the ones in the light grey
area satisfy G. As stated before, the semantics given to $\langle U\rangle\varphi$ is the one of the existential
modality somewhere [26], stating that there is a way to change the current evaluation
node so that $\varphi$ becomes true. Its dual operator $[U]\varphi := \neg\langle U\rangle\neg\varphi$ is the universal modality
everywhere. The semantics given to $\blacklozenge\varphi$ is the one of the sabotage modality from [4],
which requires finding one edge of the forest that, when removed, makes the model satisfy
$\varphi$. Lastly, the $\blacklozenge^{*}$ modality, here called repeated sabotage operator, can be seen as the
operator obtained by applying $\blacklozenge$ an arbitrary number of times. Indeed, by inductively
defining $\blacklozenge^{k}\varphi$ ($k \in \mathbb{N}$) as the formula $\varphi$ for $k = 0$ and otherwise ($k \geq 1$) as $\blacklozenge\blacklozenge^{k-1}\varphi$, it
is easy to see that $(F,t,n) \models \blacklozenge^{*}\varphi$ is equivalent to $\exists k \in \mathbb{N}.\ (F,t,n) \models \blacklozenge^{k}\varphi$.

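Given a pointed forest $(F, t, n)$, we denote with $F(\mathsf{G})_t$ the set of its garbage nodes:
the set of elements in $\mathrm{dom}(F)$ that are not descendants of $t$, i.e. $F(\mathsf{G})_t := \{n \in \mathrm{dom}(F) \mid
\forall \delta \geq 1,\ F^{\delta}(n) \neq t\}$. Then, $F(\mathsf{G})_t$ is equivalent to $\{n \in \mathcal{N} \mid (F,t,n) \models \mathsf{G}\}$. We omit
the subscript $t$ from $F(\mathsf{G})_t$ when it is clear from the context. We augment the standard
precedence rules of propositional logic so that the modalities $\langle U\rangle$, $\blacklozenge$ and $\blacklozenge^{*}$ have the
same precedence as $\neg$. For instance, the formula $\langle U\rangle\mathsf{T} \wedge \mathsf{G}$ should be read as $(\langle U\rangle\mathsf{T}) \wedge \mathsf{G}$.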


Satisfiability problem. As usual, given a logic $\mathcal{L}$ and one of its interpretations $\models$ on
a class of structures $\mathfrak{C}$, the satisfiability problem of $\mathcal{L}$, denoted with SAT($\mathcal{L}$) when the
interpretation is clear from the context, takes as input a formula $\varphi$ of $\mathcal{L}$ and asks whether
there is a structure $\mathfrak{M} \in \mathfrak{C}$ such that $\mathfrak{M} \models \varphi$. If the answer is positive, then $\varphi$ is satisfiable.


Where does ALT come from? A preliminary definition of ALT was introduced in earlier work
to reason on the complexity of separation logic. As such, that version of ALT features the
separating conjunction $\varphi * \psi$ from separation logic, stating that the forest can be partitioned into
two disjoint subforests, one satisfying $\varphi$ and one satisfying $\psi$. This binary operator generalises
both the $\blacklozenge$ and $\blacklozenge^{*}$ operators (we show how in Section 4). Hence, the TOWER-hardness
of the satisfiability problem for the logic defined here cannot be inherited from that version
and must be proved (Section 3). Unfortunately, the proof does not give any indication
on whether or not the two versions of ALT have the same expressive power. What is
clear is that the two logics analyse the model in a different way: the $*$ operator is able to
reason on the model in a “concurrent” way, whereas $\blacklozenge$ and $\blacklozenge^{*}$ do it in a “sequential” one.
Let us draw an example of this. Let $(F, t, n)$ be a pointed forest. We aim at defining a
formula $\#\mathrm{ch}_{\mathrm{trg}} \geq 2$ stating that the target node $t$ has at least two children. First, we define
$\#\mathrm{ch}_{\mathrm{trg}} \geq 1$ (the formula for just one child) as $\langle U\rangle(\mathsf{T} \wedge \neg\blacklozenge\mathsf{G})$. Intuitively, $\#\mathrm{ch}_{\mathrm{trg}} \geq 2$
can then be defined with the $*$ operator simply as the formula $\#\mathrm{ch}_{\mathrm{trg}} \geq 1 * \#\mathrm{ch}_{\mathrm{trg}} \geq 1$,
stating that it is possible to partition the forest into two subforests having both at least
one child of $t$. With the $\blacklozenge$ operator, this property is instead defined as

\[ \#\mathrm{ch}_{\mathrm{trg}} \geq 2 \;:=\; \langle U\rangle\big(\mathsf{T} \wedge \neg\blacklozenge\mathsf{G} \wedge \blacklozenge(\neg\mathrm{inDom} \wedge \#\mathrm{ch}_{\mathrm{trg}} \geq 1)\big), \]

where $\mathrm{inDom} := \mathsf{T} \vee \mathsf{G}$ states that the current evaluation node is in the domain of the
forest. This definition of $\#\mathrm{ch}_{\mathrm{trg}} \geq 2$ requires finding one child of $t$ (as encoded by the
“$\langle U\rangle(\mathsf{T} \wedge \neg\blacklozenge\mathsf{G} \wedge \dots$” part of the formula) and removing it from the model (as expressed
by the “$\blacklozenge(\neg\mathrm{inDom} \wedge \dots$” part). Only afterwards do we check for the existence of a second
child of $t$. This form of “sequential reasoning” (that can often be avoided when using
the $*$ operator) is used in almost all the formulae of the next sections: we first find a
node satisfying a certain property, we remove it from the structure, and only afterwards
we check if the model satisfies a second property. This principle only works well for
monotonic properties: with respect to the definition of $\#\mathrm{ch}_{\mathrm{trg}} \geq 2$, the set of children of $t$
monotonically decreases when considering subforests. Thus, finding a child of $t$ in the
subforest implies finding a child of $t$ in the original forest.
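To experiment with the “sequential reasoning” pattern, the following is a minimal brute-force model checker for ALT, written as a sketch under stated assumptions: forests are dictionaries from nodes to parents, formulae are nested tuples, and the constructor names (`Not`, `And`, `Or`, `U`, `Sab`) are illustrative. Since T and G both fail on every node outside dom(F) and sabotage only shrinks the domain, the sketch evaluates the somewhere modality on dom(F) plus a single placeholder `FRESH` standing for all remaining nodes. The two formulae at the end state that the target has at least one child, namely ⟨U⟩(T ∧ ¬◆G), and at least two children, namely ⟨U⟩(T ∧ ¬◆G ∧ ◆(¬inDom ∧ ⟨U⟩(T ∧ ¬◆G))) with inDom = T ∨ G.

```python
from itertools import combinations

FRESH = object()  # placeholder for any node outside dom(F)


def reaches(f, n, t):
    """True iff n is an F-descendant of t (F^d(n) = t for some d >= 1)."""
    while n in f:
        n = f[n]
        if n == t:
            return True
    return False


def sat(f, t, n, phi):
    """Decide (F, t, n) |= phi by exhaustive search (exponential time)."""
    op = phi[0]
    if op == "top":
        return True
    if op == "T":
        return reaches(f, n, t)
    if op == "G":
        return n in f and not reaches(f, n, t)
    if op == "not":
        return not sat(f, t, n, phi[1])
    if op == "and":
        return sat(f, t, n, phi[1]) and sat(f, t, n, phi[2])
    if op == "U":        # somewhere: move the current evaluation node
        return any(sat(f, t, m, phi[1]) for m in list(f) + [FRESH])
    if op == "sab":      # sabotage: remove exactly one edge
        return any(sat({k: v for k, v in f.items() if k != e}, t, n, phi[1])
                   for e in f)
    if op == "sabstar":  # repeated sabotage: move to any subforest
        keys = list(f)
        return any(sat({k: f[k] for k in keep}, t, n, phi[1])
                   for r in range(len(keys) + 1)
                   for keep in combinations(keys, r))
    raise ValueError(f"unknown operator: {op}")


TOP, T, G = ("top",), ("T",), ("G",)


def Not(p):
    return ("not", p)


def And(*ps):
    out = ps[0]
    for p in ps[1:]:
        out = ("and", out, p)
    return out


def Or(p, q):
    return Not(And(Not(p), Not(q)))


def U(p):
    return ("U", p)


def Sab(p):
    return ("sab", p)


IN_DOM = Or(T, G)
CH_TRG_1 = U(And(T, Not(Sab(G))))
CH_TRG_2 = U(And(T, Not(Sab(G)), Sab(And(Not(IN_DOM), CH_TRG_1))))
```

For instance, on the forest `{1: 0, 2: 0, 3: 1}` with target `0`, both formulae hold, whereas on `{1: 0, 3: 1, 4: 3}` only `CH_TRG_1` does.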


3 On the complexity and expressive power of ALT 


In this section, we show that SAT(ALT) is TOWER-hard by reduction from the satisfiability
problem of Propositional Interval Temporal Logic on finite words (Section 3.3). The
proof adapts the arguments used for the version of ALT featuring the separating
conjunction $*$. The reduction is somewhat non-intuitive and was originally given without
explaining why more direct ways fail. Here, we clarify this issue, which is related to
the fact that ALT cannot deduce any property of the portion of a pointed forest $(F, t, n)$
corresponding to the nodes in $F(\mathsf{G})$, except for the size of $F(\mathsf{G})$ and the query $n \in F(\mathsf{G})$.
This is done in Section 3.2 by relying on a notion of Ehrenfeucht-Fraïssé games for ALT.


3.1 Towards the TOWER-hardness of SAT(ALT): how to encode finite words. 


As a first step, we define a correspondence between finite words and specific pointed
forests. As usual, we define the set of finite words on a finite alphabet $\Sigma$ as the closure
under Kleene star $\Sigma^{*}$. To ease our modelling, we suppose $\Sigma = [1, n]$ to be the alphabet
of natural numbers between 1 and $n$. Let $w = a_1 \cdots a_k$ be a $k$-symbols word in $\Sigma^{*}$ and
$M = \{n_1, \dots, n_k\}$ be a set of $k$ nodes. Let $N_i$ ($i \in [1, k]$) be a set of $a_i + 1$ nodes different
from $n_1, \dots, n_k$ and so that for each distinct $i, j \in [1, k]$, $N_i \cap N_j = \emptyset$. Lastly, let $t$ be a node
not in $M \cup \bigcup_{i \in [1,k]} N_i$. A pointed forest $(F, t, n)$ encodes $w$ w.r.t. the sets $M, N_1, \dots, N_k$
iff (I) $F(n_k) = t$, (II) for all $i \in [1, k-1]$, $F(n_i) = n_{i+1}$, (III) for all $i \in [1, k]$ and $n' \in N_i$,
$F(n') = n_i$ and (IV) every $F$-descendant of $t$ belongs to a set among $M, N_1, \dots, N_k$.

We call the path from $n_1$ to $n_k$ the main path of $F$. The nodes of this path are the
ones in $M$, and can be characterised as being the only descendants of $t$ with at least
one child. Moreover, $n_1$ is the only node of the main path having the same number of
descendants and children. We say that a node $n \in \mathrm{dom}(F)$ encodes the symbol $a \in \Sigma$ if
it is a descendant of $t$ and it has exactly $a + 1$ children that are not in $M$. Then, the nodes
in $M$ are the only ones encoding symbols, where $n_i$ encodes $a_i$ for any $i \in [1, k]$. For
instance, Figure 2 shows an encoding of the word 1121.
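The encoding can be sketched in a few lines, assuming forests are dictionaries from nodes to parents. The node names (`("n", i)` for the main path, `("leaf", i, j)` for the sets N_i, `"t"` for the target) and the functions `encode_word`, `children` and `decode_word` are hypothetical and only serve the illustration.

```python
def encode_word(word):
    """Build a forest encoding `word`, a list of integers >= 1.

    Returns (forest, target, main_path), where main_path lists n_1..n_k.
    """
    forest = {}
    target = "t"
    k = len(word)
    main_path = [("n", i) for i in range(1, k + 1)]
    for i, a in enumerate(word, start=1):
        node = ("n", i)
        # (I) the last node points to t, (II) n_i points to n_{i+1}
        forest[node] = ("n", i + 1) if i < k else target
        # (III) n_i gets a + 1 fresh children outside the main path
        for j in range(a + 1):
            forest[("leaf", i, j)] = node
    return forest, target, main_path


def children(forest, n):
    """All nodes whose parent is n."""
    return [c for c, p in forest.items() if p == n]


def decode_word(forest, target):
    """Read the word back, walking the main path from t down to n_1."""
    word = []
    tops = children(forest, target)
    node = tops[0] if tops else None
    while node is not None:
        kids = children(forest, node)
        inner = [c for c in kids if children(forest, c)]  # next path node
        word.append(len(kids) - len(inner) - 1)
        node = inner[0] if inner else None
    return word[::-1]
```

Decoding relies on the characterisation given above: the main-path nodes are exactly the descendants of the target with at least one child, so at each step the unique child that itself has children continues the path.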
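Fig. 2. Encoding of 1121 (main path $n_1, n_2, n_3, n_4$ leading to the target node $t$; the nodes of the path encode the symbols 1, 1, 2, 1).

Table 1. Formulae and their meaning on $(F, t, n)$.

Formula                               Intended meaning
$\mathrm{size}(\mathsf{G}) \geq \beta$        $|F(\mathsf{G})| \geq \beta$.
$\#\mathrm{desc} \geq \beta$                  $(F,t,n) \models \mathsf{T}$ and $n$ has at least $\beta$ descendants.
$\#\mathrm{child} \geq \beta$                 $(F,t,n) \models \mathsf{T}$ and $n$ has at least $\beta$ children.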


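In order to characterise the class of pointed forests encoding finite words, we adapt
the formulae shown in Table 1 (where their semantics is described). Let $(F, t, n)$
be a pointed forest and let $\beta \in \mathbb{N}$. The formula $\mathrm{size}(\mathsf{G}) \geq \beta$ is inductively defined as:

\[ \mathrm{size}(\mathsf{G}) \geq 0 := \top, \qquad \mathrm{size}(\mathsf{G}) \geq \beta{+}1 := \langle U\rangle\big(\mathsf{G} \wedge \blacklozenge(\neg\mathrm{inDom} \wedge \mathrm{size}(\mathsf{G}) \geq \beta)\big). \]

Notice how, in the definition of $\mathrm{size}(\mathsf{G}) \geq \beta{+}1$, we use the same principle used to encode
$\#\mathrm{ch}_{\mathrm{trg}} \geq 2$ at the end of Section 2: we first find a node in $F(\mathsf{G})$, remove it from the model,
and then find other $\beta$ elements of $F(\mathsf{G})$. The formulae $\#\mathrm{desc} \geq \beta$ and $\#\mathrm{child} \geq \beta$ (again,
we refer to Table 1 for their semantics) are instead defined as:

\[ \#\mathrm{desc} \geq \beta := \blacklozenge^{*}\big(\underbrace{[U]\neg\mathsf{G}}_{F(\mathsf{G}) \text{ is empty.}} \wedge\ \mathsf{T} \wedge \underbrace{\blacklozenge(\neg\mathrm{inDom} \wedge \mathrm{size}(\mathsf{G}) \geq \beta)}_{\text{Removing } n \text{ leads to a set of garbage nodes of size at least } \beta.}\big) \]

\[ \#\mathrm{child} \geq 0 := \mathsf{T}, \qquad \#\mathrm{child} \geq \beta{+}1 := \#\mathrm{desc} \geq \beta{+}1 \wedge \underbrace{\neg\blacklozenge^{\beta}(\mathsf{T} \wedge \neg\,\#\mathrm{desc} \geq 1)}_{\substack{\text{Whenever } \beta \text{ nodes of } \mathrm{dom}(F) \text{ are removed, if } n \text{ still}\\ \text{reaches } t \text{ then it has at least one descendant.}}} \]

Given $s \in \{\mathrm{size}(\mathsf{G}), \#\mathrm{ch}_{\mathrm{trg}}, \#\mathrm{desc}, \#\mathrm{child}\}$, we write $s = \beta$ for $s \geq \beta \wedge \neg\, s \geq \beta{+}1$.
For instance, $\#\mathrm{child} = \beta$ is the formula that checks whether $n$ has exactly $\beta$ children
and it is a descendant of $t$. We can now conclude the encoding of finite words.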
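As a worked instance of the inductive definition, and assuming the reading size(G) ≥ 0 := ⊤ and size(G) ≥ β+1 := ⟨U⟩(G ∧ ◆(¬inDom ∧ size(G) ≥ β)) with inDom := T ∨ G, unfolding the formula for β = 2 gives:

```latex
\[
\mathrm{size}(\mathsf{G}) \geq 2
\;=\;
\langle U\rangle\Big(\mathsf{G} \wedge \blacklozenge\big(\neg\mathrm{inDom} \wedge
\langle U\rangle\big(\mathsf{G} \wedge \blacklozenge(\neg\mathrm{inDom} \wedge \top)\big)\big)\Big)
\]
```

Each unfolding step picks one garbage node, removes its outgoing edge (so that the node leaves the domain), and continues counting in the resulting subforest. Accordingly, size(G) = 1 abbreviates size(G) ≥ 1 ∧ ¬ size(G) ≥ 2.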

Let $(F, t, n)$ be a pointed forest encoding $w \in \Sigma^{*}$ and let $M$ be the set of nodes in its
main path. Let us recall two properties of our encoding: (I) a node $n'$ encodes a symbol
of $w$ iff $n' \in M$, and (II) the node encoding the first symbol of $w$ is the only node in $M$
with the same number of descendants and children. To reflect (I) we denote with $\mathrm{symb}$
the formula $\#\mathrm{desc} \geq 1$. For (II), given $S \subseteq \Sigma$, we introduce the formula $\mathrm{1st}_S$ that checks
if the current evaluation node corresponds to the first node of the main path and encodes
a symbol in $S$. It is defined as $\bigvee_{\beta \in S}(\#\mathrm{desc} = \beta{+}1 \wedge \#\mathrm{child} = \beta{+}1)$. The following
statement formalises the connection between this formula and property (II) stated above.

Lemma 1. Let $w \in \Sigma^{*}$. Let $(F, t, n)$ be a pointed forest encoding $w$. Let $n_1$ be the first
node in the main path of $F$. For every $S \subseteq \Sigma$, $(F,t,n) \models \langle U\rangle\mathrm{1st}_S$ iff $(F,t,n_1) \models \mathrm{1st}_S$.

We are finally ready to define the formula $\mathrm{word}_{\Sigma}$, characterising the class of forests that
encode words in $\Sigma^{*}$. It is proved correct by Lemma 2 and it is defined as follows:

\[
\mathrm{word}_{\Sigma} := \overbrace{(\langle U\rangle\mathsf{T} \Rightarrow \langle U\rangle\mathrm{symb})}^{\substack{\text{The target node has no descendants,}\\ \text{or has a descendant that encodes a symbol.}}} \wedge\ \neg\,\#\mathrm{ch}_{\mathrm{trg}} \geq 2\ \wedge\ [U]\big(\mathrm{symb} \Rightarrow \mathrm{1st}_{\Sigma} \vee \underbrace{(\neg\mathrm{1st}_{\{n+1\}} \wedge \blacklozenge\,\mathrm{1st}_{\Sigma})}_{\substack{\text{The current node encodes a symbol in } [1,n]\\ \text{and exactly one of its children encodes a symbol.}}}\big).
\]

Lemma 2. A pointed forest $(F, t, n)$ is an encoding of a word in $\Sigma^{*}$ iff $(F, t, n) \models \mathrm{word}_{\Sigma}$.
Game played on $((F_1, t_1, n_1), (F_2, t_2, n_2), (m, s, k))$.
If there is $p \in \{\mathsf{G}, \mathsf{T}\}$ s.t. not ($(F_1, t_1, n_1) \models p$ iff $(F_2, t_2, n_2) \models p$) then the spoiler wins, otherwise
the spoiler chooses $i \in \{1, 2\}$ and plays on $(F_i, t_i, n_i)$. The duplicator replies on $(F_j, t_j, n_j)$ where
$j \in \{1, 2\} \setminus \{i\}$. The spoiler must choose one of the following moves (else the duplicator wins).
$\langle U\rangle$ move: if $m \geq 1$ then the spoiler can choose to play a $\langle U\rangle$ move. It selects a node $n_i' \in \mathcal{N}$.
  — Then, the duplicator must reply with some node $n_j' \in \mathcal{N}$ (otherwise the spoiler wins).
  — The game continues on $((F_1, t_1, n_1'), (F_2, t_2, n_2'), (m-1, s, k))$.
$\blacklozenge$ move: if $s \geq 1$ and $\mathrm{dom}(F_i) \neq \emptyset$ then the spoiler can choose to play a $\blacklozenge$ move. It selects
  a finite forest $F_i'$ such that $F_i' \sqsubseteq F_i$ and $|\mathrm{dom}(F_i')| = |\mathrm{dom}(F_i)| - 1$.
  — The duplicator must reply with some $F_j' \sqsubseteq F_j$ s.t. $|\mathrm{dom}(F_j')| = |\mathrm{dom}(F_j)| - 1$.
  — The game continues on $((F_1', t_1, n_1), (F_2', t_2, n_2), (m, s-1, k))$.
$\blacklozenge^{*}$ move: if $k \geq 1$ then the spoiler can choose to play a $\blacklozenge^{*}$ move. It selects a forest $F_i' \sqsubseteq F_i$.
  — The duplicator must reply with some $F_j'$ s.t. $F_j' \sqsubseteq F_j$.
  — The game continues on $((F_1', t_1, n_1), (F_2', t_2, n_2), (m, s, k-1))$.

Fig. 3. Ehrenfeucht-Fraïssé games for ALT


3.2 Inexpressibility results via the Ehrenfeucht-Fraïssé games for ALT


Now that we are more familiar with the logic, before completing the TOWER-hardness
proof of SAT(ALT) we show some properties that ALT cannot express. Notably, these
properties explain why the TOWER-hardness proof of the next section cannot be easily
simplified. Moreover, inexpressibility results effectively reduce the set of forests that
must be considered in order to solve SAT(ALT). This in turn makes reductions from
SAT(ALT) to other logics more immediate, as we show throughout Section 4.

A standard way of proving inexpressibility results for logics interpreted on finite
models is by adaptation of the Ehrenfeucht-Fraïssé games [29], as done for other
relation-changing logics such as context logic for trees and ambient logic [16].

We define the rank of a formula $\varphi$ as the triple $(m, s, k) \in \mathbb{N}^{3}$ where the modal rank
$m$ is the greatest nesting depth of the modal operator $\langle U\rangle$ in $\varphi$, whereas the sabotage
rank $s$ (resp. repeated sabotage rank $k$) is the greatest nesting depth of the $\blacklozenge$ (resp. $\blacklozenge^{*}$)
operator in $\varphi$. We denote with ALT(rk) the set of formulae with rank $rk \in \mathbb{N}^{3}$.
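The rank of a formula can be computed by a direct structural recursion. The sketch below assumes formulae are represented as nested tuples tagged `"top"`, `"T"`, `"G"`, `"not"`, `"and"`, `"U"`, `"sab"` and `"sabstar"`; this representation and the function name `rank` are illustrative choices, not part of the paper.

```python
def rank(phi):
    """Return (m, s, k): nesting depths of <U>, sabotage, repeated sabotage."""
    op = phi[0]
    if op in ("top", "T", "G"):
        return (0, 0, 0)
    if op == "not":
        return rank(phi[1])
    if op == "and":
        left, right = rank(phi[1]), rank(phi[2])
        # nesting depth of a conjunction: componentwise maximum
        return tuple(max(a, b) for a, b in zip(left, right))
    m, s, k = rank(phi[1])
    if op == "U":
        return (m + 1, s, k)
    if op == "sab":
        return (m, s + 1, k)
    if op == "sabstar":
        return (m, s, k + 1)
    raise ValueError(f"unknown operator: {op}")
```

For example, the formula ⟨U⟩(T ∧ ¬◆G) has rank (1, 1, 0): one somewhere modality, one sabotage modality, and no repeated sabotage.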

The Ehrenfeucht-Fraïssé games (EF-games) for ALT are formally defined in Figure 3.
A game is played by two players: the spoiler and the duplicator. A game state
$((F_1, t_1, n_1), (F_2, t_2, n_2), rk)$ is a triple made of a rank $rk$ and two pointed forests $(F_1, t_1, n_1)$
and $(F_2, t_2, n_2)$. The goal of the spoiler is to show that the two structures are different.
The goal of the duplicator is to counter the spoiler and show that the two structures are
similar. Let us make clear what we mean by two models being different: both players
can only play following the rules of the logical formalism (in our case, ALT). Then, two
models are different if and only if there is a formula $\varphi \in$ ALT(rk) that is satisfied by
only one of the two models. This correspondence between the game and the logic is
expressed by an adequacy result, formalised below in Lemma 3.

A player has a winning strategy if it can play in a way that guarantees it the victory,
regardless of what the other player does. We write $(F_1, t_1, n_1) \approx_{rk} (F_2, t_2, n_2)$ whenever the
duplicator has a winning strategy for the game $((F_1, t_1, n_1), (F_2, t_2, n_2), rk)$. By Martin’s
Theorem our games are determined: if the duplicator does not have a winning
strategy then the spoiler has one, and vice-versa. Hence, $(F_1, t_1, n_1) \not\approx_{rk} (F_2, t_2, n_2)$ refers to
the fact that the spoiler has a winning strategy.


Lemma 3. $(F_1, t_1, n_1) \not\approx_{rk} (F_2, t_2, n_2)$ iff $\exists \varphi \in$ ALT(rk), $(F_1, t_1, n_1) \models \varphi$ and $(F_2, t_2, n_2) \not\models \varphi$.


The left-to-right direction is proved by induction on the rank and by cases on the first
move that the spoiler makes in its winning strategy. The other direction is proved by
structural induction on $\varphi$. We start to use the EF-games to derive three easy results.


Lemma 4. Let $\varphi$ be a formula.
1. $\varphi$ is satisfiable iff it is satisfiable by a pointed forest $(F, t, n)$ where $t \notin \mathrm{dom}(F)$.
2. Given a forest $F$ and nodes $t \in \mathcal{N}$ and $n, n' \notin \mathrm{dom}(F)$, $(F,t,n) \models \varphi$ iff $(F,t,n') \models \varphi$.
3. If the duplicator has a winning strategy for a game $((F_1, t_1, n_1), (F_2, t_2, n_2), rk)$ then it
   has a winning strategy where it always replies to $\langle U\rangle$ moves by selecting nodes in
   $\mathrm{dom}(F_i) \cup \mathrm{ran}(F_i)$, for some $i \in \{1, 2\}$.


Proof (sketch). We sketch the proof of (1) to show how EF-games are used. Let us
consider a pointed forest $(F, t, n)$ such that $(F, t, n) \models \varphi$. We take a node $t' \notin \mathrm{dom}(F) \cup
\mathrm{ran}(F)$ and define the forest $F'(n') :=$ if $F(n') = t$ then $t'$ else $F(n')$. Notice that $t' \notin
\mathrm{dom}(F')$. We then prove $\forall rk \in \mathbb{N}^{3}$, $(F, t, n) \approx_{rk} (F', t', n)$ by induction on $rk$, leading to
(1) directly by Lemma 3. The proof of (3) essentially follows from (2). □


Interestingly enough, the third statement of Lemma 4 fundamentally implies that enforcing
$\mathcal{N}$ to be finite, instead of infinite as we do throughout this work, does not change the
expressive power nor the complexity of ALT.

Let $(F, t, n)$ be a pointed forest. We now show that ALT has a very limited expressive
power with respect to the garbage nodes. In particular, it can only check for the
membership of $n$ in $F(\mathsf{G})$ (with the formula $\mathsf{G}$) and for the size of $F(\mathsf{G})$ (with the formula
$\mathrm{size}(\mathsf{G}) \geq \beta$). We formalise this inexpressibility result as follows.


Lemma 5. Let $rk = (m, s, k)$. Let $F$, $F_1$ and $F_2$ be three forests and let $n, t \in \mathcal{N}$, such
that for every $i \in \{1, 2\}$, $F \sqsubseteq F_i$ and $F_i(\mathsf{G})_t = \mathrm{dom}(F_i) \setminus \mathrm{dom}(F)$. If we have

\[ n \in F_1(\mathsf{G})_t \text{ iff } n \in F_2(\mathsf{G})_t \quad\text{and}\quad \min(|F_1(\mathsf{G})_t|, m+s+k) = \min(|F_2(\mathsf{G})_t|, m+s+k) \]

then $(F_1, t, n) \approx_{rk} (F_2, t, n)$.

Let us informally explain Lemma 5, whose proof is by induction on $rk$ and by cases on
the moves of the spoiler. Let $(F_1, t, n)$ be a pointed forest and suppose (ad absurdum)
that it satisfies a formula $\varphi$ of rank $rk$ that expresses a property of the garbage nodes that
is different from the ones cited above. For example, let us assume that $\varphi$ characterises
the set of pointed forests having a garbage node with at least two children. Consider
the subforest $F \sqsubseteq F_1$ whose domain corresponds to the set of $F_1$-descendants of $t$. In
particular, $F_1(\mathsf{G})_t = \mathrm{dom}(F_1) \setminus \mathrm{dom}(F)$. We extend $F$ to a forest $F_2$ by (re)defining
it on the nodes in $F_1(\mathsf{G})_t$ so that $F_2(\mathsf{G})_t = F_1(\mathsf{G})_t$ and none of these nodes has more
than one $F_2$-child (this construction can always be done). This last equality implies that
$n \in F_1(\mathsf{G})_t \Leftrightarrow n \in F_2(\mathsf{G})_t$ and $\min(|F_1(\mathsf{G})_t|, m+s+k) = \min(|F_2(\mathsf{G})_t|, m+s+k)$. By
Lemma 5, $(F_1, t, n) \approx_{rk} (F_2, t, n)$, which implies $(F_2, t, n) \models \varphi$ by Lemma 3. However,
$(F_2, t, n)$ is defined so that every node in $F_2(\mathsf{G})_t$ has at most one child. Thus, $\varphi$ cannot
characterise the set of models having a garbage node with at least two children.

As shown in the next section, the inexpressibility result in Lemma 5 plays a central
role in the development of the reduction that leads to the TOWER-hardness of SAT(ALT).
3.3 PITL on marked words and the TOWER-hardness of SAT(ALT)


We are now ready to show the non-elementarity of SAT(ALT). The proof is by reduction
from the satisfiability problem of Propositional Interval Temporal Logic (PITL) under
locality principle [34,25], which in turn is shown TOWER-hard by reduction from the
non-emptiness problem of star-free regular languages (a problem with a known TOWER
characterisation). PITL is a well-known logic that was introduced by Moszkowski
for the verification of hardware components. It is interpreted on non-empty finite words
over a finite alphabet of unary symbols $\Sigma$. Its formulae are from the grammar:

\[ \varphi := \varphi \wedge \varphi \mid \neg\varphi \mid a \mid 1 \mid \varphi\,|\,\varphi \]

where $a \in \Sigma$. Under the locality principle interpretation, a word $w = a_1 \cdots a_k \in \Sigma^{+}$
satisfies $a$ whenever $a_1 = a$. Moreover, $w$ satisfies $1$ if it is a word of length one (i.e.
$w \in \Sigma$). The main feature of this logic is its chop operator “$|$”. Intuitively, $\varphi\,|\,\psi$ is satisfied
by words that can be “chopped” into a prefix and a suffix sharing one symbol, so that the
prefix satisfies $\varphi$ and the suffix satisfies $\psi$. Formally,

\[ a_1 \cdots a_k \models \varphi\,|\,\psi \;\overset{\text{def}}{\iff}\; \text{there is } i \in [1, k] \text{ such that } a_1 \cdots a_i \models \varphi \text{ and } a_i \cdots a_k \models \psi. \]

Translating $|$ in ALT is not easy. Indeed, given the encoding of words proposed in
Section 3.1, chopping $w$ in two pieces means splitting in some way the main path $n_1, \dots, n_k$
of a forest $(F, t, n)$ encoding $w$ to then check that the word encoded by $n_1, \dots, n_i$ satisfies
$\varphi$ and the one encoded by $n_{i+1}, \dots, n_k$ satisfies $\psi$. However, by doing this the elements
$n_1, \dots, n_i$ become garbage nodes. Thus, as a consequence of Lemma 5, ALT cannot check
in any way what is the word encoded by these nodes. Easy translations from PITL to
ALT seem therefore impossible and, as done in [31], we are required to go through an
alternative interpretation of PITL based on marking symbols instead of chopping words.


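A marking of an alphabet $\Sigma$ is a bijection $\overline{(\cdot)} : \Sigma \to \overline{\Sigma}$, relating a symbol $a \in \Sigma$ to its
marked variant $\overline{a} \in \overline{\Sigma}$. We denote with $\mathbf{\Sigma}$ the extended alphabet $\Sigma \uplus \overline{\Sigma}$. A word is marked
if it has some symbols from $\overline{\Sigma}$. We introduce the satisfaction relation $\models_{\bullet}$ on a marked
word $w \in \mathbf{\Sigma}^{*}$. It is defined as usual for Boolean connectives. Moreover,

\[ w \models_{\bullet} a \;\overset{\text{def}}{\iff}\; w \text{ is headed by } a \text{ or } \overline{a}; \qquad w \models_{\bullet} 1 \;\overset{\text{def}}{\iff}\; w \text{ is headed by a marked symbol.} \]

The definition of $\varphi\,|\,\psi$ is more involved. Let $w' \in \Sigma^{*}$, $\overline{a} \in \overline{\Sigma}$ and $w'' \in \mathbf{\Sigma}^{*}$ be such that
$w = w'\overline{a}w''$, so that $\overline{a}$ is the first marked symbol occurring in $w$ (this decomposition is
uniquely defined). Then, $w'\overline{a}w'' \models_{\bullet} \varphi\,|\,\psi$ holds if and only if there is $b \in \Sigma$ s.t.

(a) $w'$ is the empty word, $b = a$ and $\overline{a}w'' \models_{\bullet} \varphi \wedge \psi$, or

(b) there is $w_1 \in \Sigma^{*}$ s.t. $w' = bw_1$, $\overline{b}w_1\overline{a}w'' \models_{\bullet} \varphi$ and $bw_1\overline{a}w'' \models_{\bullet} \psi$, or

(c) $w'$ is not the empty word, $b = a$, $w'\overline{a}w'' \models_{\bullet} \varphi$ and $\overline{a}w'' \models_{\bullet} \psi$, or

(d) $\exists w_1 \in \Sigma^{+}$, $\exists w_2 \in \Sigma^{*}$ s.t. $w' = w_1bw_2$, $w_1\overline{b}w_2\overline{a}w'' \models_{\bullet} \varphi$ and $bw_2\overline{a}w'' \models_{\bullet} \psi$.

On this semantics, the satisfaction of a formula only depends on the prefix $a_1 \cdots a_{i-1}\overline{a_i}$
of $w$ that ends with the first marked symbol. To check whether $w \models_{\bullet} \varphi\,|\,\psi$ we search for
a position $j \in [1, i]$ inside this prefix so that $\varphi$ is satisfied by the word obtained from $w$
by marking the $j$-th symbol, whereas $\psi$ is satisfied by the suffix of $w$ starting in $j$. In
the definition above, this idea is split into four cases (a)-(d), depending on the truth
of $j = 1$ and $j = i$. This is done as it better reflects the encoding of PITL in ALT. The
semantics on marked words is related to the standard semantics of PITL as follows.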
Proposition 1 (from [31]). Let $w \in \Sigma^{*}$, $a \in \Sigma$ and $w' \in \mathbf{\Sigma}^{*}$. Let $\varphi$ be a formula in
PITL. $wa$ satisfies $\varphi$ under the standard interpretation of PITL if and only if $w\overline{a}w' \models_{\bullet} \varphi$.
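The relationship between the two interpretations can be sanity-checked by brute force on short words. The sketch below is not the construction used in the paper: it assumes formulae are nested tuples tagged `"sym"`, `"one"`, `"not"`, `"and"` and `"chop"`, plain words are tuples of integers, and marked words are tuples of `(symbol, is_marked)` pairs. The chop clause on marked words iterates uniformly over the positions up to the first marked symbol, which covers the four cases (a)-(d) at once. All function names are hypothetical.

```python
from itertools import product


def sat_std(w, phi):
    """Standard PITL semantics (locality principle) on a non-empty word."""
    op = phi[0]
    if op == "sym":
        return w[0] == phi[1]
    if op == "one":
        return len(w) == 1
    if op == "not":
        return not sat_std(w, phi[1])
    if op == "and":
        return sat_std(w, phi[1]) and sat_std(w, phi[2])
    if op == "chop":  # prefix and suffix share the symbol at position i
        return any(sat_std(w[:i], phi[1]) and sat_std(w[i - 1:], phi[2])
                   for i in range(1, len(w) + 1))
    raise ValueError(op)


def sat_marked(w, phi):
    """Semantics on marked words; w must contain a marked symbol."""
    op = phi[0]
    if op == "sym":
        return w[0][0] == phi[1]
    if op == "one":
        return w[0][1]
    if op == "not":
        return not sat_marked(w, phi[1])
    if op == "and":
        return sat_marked(w, phi[1]) and sat_marked(w, phi[2])
    if op == "chop":
        first = next(i for i, (_, marked) in enumerate(w) if marked)
        for j in range(first + 1):
            # mark the j-th symbol for the left formula, cut at j for the right
            marked_j = w[:j] + ((w[j][0], True),) + w[j + 1:]
            if sat_marked(marked_j, phi[1]) and sat_marked(w[j:], phi[2]):
                return True
        return False
    raise ValueError(op)


def formulas(alphabet, depth):
    """All formulae up to the given connective depth (with repetitions)."""
    level = [("sym", a) for a in alphabet] + [("one",)]
    for _ in range(depth):
        new = [("not", p) for p in level]
        new += [(op, p, q) for op in ("and", "chop")
                for p in level for q in level]
        level = level + new
    return level


def brute_force_check(alphabet=(1, 2), depth=1, max_len=2):
    """Compare both semantics on w.a versus w.a-marked.w' for short words."""
    extended = [(a, m) for a in alphabet for m in (False, True)]
    for phi in formulas(alphabet, depth):
        for k in range(max_len + 1):
            for w in product(alphabet, repeat=k):
                for a in alphabet:
                    for tail_len in range(2):
                        for tail in product(extended, repeat=tail_len):
                            marked = (tuple((b, False) for b in w)
                                      + ((a, True),) + tail)
                            if sat_std(w + (a,), phi) != sat_marked(marked, phi):
                                return False
    return True
```

Calling `brute_force_check()` exhaustively compares the two satisfaction relations over a two-letter alphabet; it only tests small instances and is no substitute for the inductive proof.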


The alternative interpretation of PITL allows us to reduce SAT(PITL) to SAT(ALT)
in a neat way. Let Σ = [1, n], Σ̂ = Σ ∪ Σ̄ and let f: Σ̂ → [1, 2n] be the bijection f(a) ≝ 2a
for a ∈ Σ and f(ā) ≝ 2a − 1 for ā ∈ Σ̄. f(a₁⋯a_k) denotes the word f(a₁)⋯f(a_k). f maps
Σ̂ into the alphabet [1, 2n], whose words can be encoded into trees (as in Section 3.1).
In these trees each symbol a ∈ Σ (resp. ā ∈ Σ̄) corresponds to a node in the main path
having 2a + 1 (resp. 2a) children not in this path. Therefore, given a node n encoding a
symbol in Σ, removing exactly one child of n that is not in the main path is equivalent
to marking the symbol n encodes. Based on this description, we can check if the current
evaluation node encodes a marked symbol from Σ̄ with the following formula:


marks ≝ ⋁_{a∈Σ} ((#child = 2a ∧ 1st_{[1,2n]}) ∨ (#child = 2a + 1 ∧ ¬1st_{[1,2n]}))
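
The bijection f and the effect of marking on the tree encoding can be illustrated with a few lines of Python. This is a minimal sketch: the function names and the representation of an extended symbol as a pair (symbol, is_marked) are assumptions made for the example only.

```python
def f(symbol, marked):
    """Map an extended symbol to [1, 2n]: a goes to 2a, its marked variant to 2a - 1."""
    return 2 * symbol - 1 if marked else 2 * symbol


def f_inverse(k):
    """Inverse of f: odd values are marked symbols, even values unmarked ones."""
    return ((k + 1) // 2, k % 2 == 1)


def off_path_children(symbol, marked):
    """Children outside the main path: 2a + 1 for a and 2a for its marked variant."""
    return f(symbol, marked) + 1


def encode_word(word):
    """Apply f letter by letter to a word over the extended alphabet."""
    return [f(a, m) for a, m in word]
```

With this reading, marking a symbol lowers the number of off-path children of its node by exactly one, which is what removing one such child achieves in the tree.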


As already stated, w ⊨• φ examines the prefix of w that ends with the first marked
symbol. In pointed forests (F, t, n) encoding w, this prefix corresponds to the subtree
whose root encodes a marked symbol and is an F-descendant of every other node encoding
marked symbols. Therefore, to characterise this tree we need to track the number of nodes
encoding marked symbols. We first define a formula marks ≥ β stating that the forest has
at least β ∈ ℕ nodes encoding marked symbols. It is defined as ⊤ for β = 0, and otherwise
(β ≥ 1) as ⟨U⟩(marks ∧ ♦(¬inDom ∧ marks ≥ β − 1)). Again, this formula uses the
same principle introduced in Section 2 for #child ≥ β: we search for a node encoding
a marked symbol, remove it from the structure and then search for β − 1 other such
nodes. Similarly, we introduce #markAncs ≥ β ≝ symb ∧ ♦(¬inDom ∧ marks ≥ β),
the formula stating that the current evaluation node encodes a symbol and has at least β
ancestors that encode marked symbols.

At last, for a formula φ in PITL having symbols from Σ = [1, n], we introduce its
translation V_β(φ) in ALT, where β ≥ 1 tracks the number of nodes encoding marked symbols. It
is homomorphic for Boolean connectives: V_β(¬φ) ≝ ¬V_β(φ) and V_β(φ ∧ ψ) ≝ V_β(φ) ∧ V_β(ψ).
For a ∈ Σ and 1, it faithfully represents the ⊨• relation: V_β(a) ≝ ⟨U⟩ 1st_{[2a−1,2a]} and
V_β(1) ≝ ⟨U⟩(1st_{[1,2n]} ∧ marks). Lastly, the formula V_β(φ|ψ) is defined as


⟨U⟩(symb ∧ ((1st_{[1,2n]} ∧ marks ∧ V_β(φ) ∧ V_β(ψ)) ∨
(1st_{[1,2n]} ∧ ¬marks ∧ ♦(marks ∧ V_{β+1}(φ)) ∧ V_β(ψ)) ∨
(¬1st_{[1,2n]} ∧ marks ∧ #markAncs ≥ β − 1 ∧ V_β(φ) ∧ ♦*(1st_{[1,2n]} ∧ V_β(ψ))) ∨
(¬1st_{[1,2n]} ∧ ¬marks ∧ #markAncs ≥ β ∧ ♦(marks ∧ V_{β+1}(φ)) ∧ ♦*(1st_{[1,2n]} ∧ V_β(ψ)))))
Notice how V_β(φ|ψ) follows closely the ⊨• relation: it is split into four disjuncts, one for
each case in the definition of φ|ψ. For example, the second disjunct of V_β(φ|ψ) encodes
the case (b) in the definition of w′āw″ ⊨• φ|ψ, as schematised below:
PITL | ∃b ∈ Σ ... ∃w₁ ∈ Σ* s.t. w′ = bw₁ and b̄w₁āw″ ⊨• φ and bw₁āw″ ⊨• ψ
ALT  | ⟨U⟩(symb ... 1st_{[1,2n]} ∧ ¬marks ∧ ♦(marks ∧ V_{β+1}(φ)) ∧ V_β(ψ))


The translation is proved correct (by induction on the structure of φ) in the next lemma.


Lemma 6. Let Σ = [1, n] and Σ̂ = Σ ∪ Σ̄. Let w ∈ Σ̂* with β ≥ 1 marked symbols. Let
(F, t, n) be an encoding of f(w). For every φ in PITL, w ⊨• φ iff (F, t, n) ⊨ V_β(φ).

Then, the reduction from SAT(PITL) on standard semantics follows as we are able to
characterise the set of pointed forests encoding words in Σ*Σ̄ (first three conjuncts in the
formula of Lemma 7). To conclude, we simply apply Lemma 6 and Proposition 1.


Lemma 7. Every φ in PITL written with symbols from Σ = [1, n] is satisfiable under the
standard interpretation of PITL if and only if the following formula in ALT is satisfiable


wordy, 2n] A (U)T A [U](marks eTA” $0) A Vi (p). 
The forest encodes a non-empty word. The only node encoding a marked symbol is the child of the target node. 


Because of the case distinction in the formula V_β(φ|ψ), the formula obtained via V_β is
exponential (hence elementary) in the number of symbols used to write φ. Therefore,
from the TOWER-hardness of SAT(PITL) we conclude that SAT(ALT) is TOWER-hard.


4 Revisiting TOWER-hard logics with ALT 


We now display the usefulness of ALT as a tool for proving the TOWER-hardness of
logics interpreted on tree-like structures. In particular, we provide semantically faithful
reductions from SAT(ALT) to the satisfiability problem of four logics that were independently
found to be TOWER-complete: first-order separation logic [9], quantified CTL on
trees [28], modal logic of heaps and modal separation logic [18]. Our reductions only
use strict fragments of these formalisms, allowing us to draw some new results on these
logics. Most notably, this section shows that all these logics are TOWER-hard because
they fundamentally provide the reachability and submodel reasoning given by ALT.


4.1 From ALT to First-Order Separation Logic 


Separation logic (SL) is an assertion logic used in state-of-the-art tools for
Hoare-style verification of heap-manipulating programs. As already stated, a preliminary
definition of ALT was given to reason on the complexity of separation logic.
Hence, here we briefly revisit the relation between ALT and SL.

Let VAR and LOC be two countably infinite sets of program variables and locations,
respectively. Separation logic is interpreted on memory states: pairs (s, h) consisting
of a function (the store) s: VAR → LOC and a partial function with finite domain (the
heap) h: LOC →fin LOC. Since ℕ and LOC are both countably infinite sets, w.l.o.g. we
assume LOC = ℕ. We extend the notation of domain, image and function composition
to stores and heaps. Two heaps h₁ and h₂ are said to be disjoint, written h₁ ⊥ h₂, whenever
dom(h₁) ∩ dom(h₂) = ∅, and when this holds the union h₁ + h₂ of h₁ and h₂ is defined
as the standard sum of functions (h₁ + h₂)(ℓ) ≝ if ℓ ∈ dom(h₁) then h₁(ℓ) else h₂(ℓ).
Let u ∈ VAR be a fixed variable that is reserved for quantification (quantification over
other variables is not possible). We consider the separation logic 1SL(∗, alloc, ↪⁺),
whose formulae are built from the following grammar (as in [31]):


φ := ⊤ | φ ∧ φ | ¬φ | emp | x = y | x ↪ y | alloc(x) | x ↪⁺ y | φ ∗ φ | ∃u φ


where x, y ∈ VAR. As shown below, the reachability predicate ↪⁺ can be seen as the
transitive closure of the standard points-to predicate ↪ of separation logic. For a memory
state (s, h), the satisfaction relation ⊨ is defined as follows:


(s, h) ⊨ emp ⇔ dom(h) = ∅.              (s, h) ⊨ x ↪ y ⇔ h(s(x)) = s(y).
(s, h) ⊨ x = y ⇔ s(x) = s(y).           (s, h) ⊨ alloc(x) ⇔ s(x) ∈ dom(h).
(s, h) ⊨ x ↪⁺ y ⇔ there is δ ≥ 1 such that h^δ(s(x)) = s(y).
(s, h) ⊨ φ ∗ ψ ⇔ ∃h₁, h₂ s.t. h₁ ⊥ h₂, h₁ + h₂ = h, (s, h₁) ⊨ φ and (s, h₂) ⊨ ψ.
(s, h) ⊨ ∃u φ ⇔ there is a location ℓ ∈ LOC such that (s[u←ℓ], h) ⊨ φ,


where s[u←ℓ] is the store updated from s by only changing the evaluation of u from s(u)
to ℓ, i.e. for every x ∈ VAR, s[u←ℓ](x) = if x = u (syntactically) then ℓ else s(x).
The main ingredient of separation logic is the separating conjunction φ ∗ ψ, that
is satisfied whenever h can be partitioned into h₁ and h₂ so that (s, h₁) ⊨ φ whereas
(s, h₂) ⊨ ψ. The ∗ operator captures the ♦ and ♦* operators as follows. Consider the
formula size = 1 ≝ ¬emp ∧ ¬(¬emp ∗ ¬emp), which is satisfied whenever |dom(h)| = 1.
We define ♦φ ≝ (size = 1) ∗ φ and ♦*φ ≝ ⊤ ∗ φ. The semantics of these formulae
is related to the analogous operators of ALT as follows:


(s, h) ⊨ ♦φ ⇔ ∃h₁, h₂ s.t. h₁ ⊥ h₂, h₁ + h₂ = h, |dom(h₁)| = 1 and (s, h₂) ⊨ φ.
(s, h) ⊨ ♦*φ ⇔ ∃h₁, h₂ s.t. h₁ ⊥ h₂, h₁ + h₂ = h and (s, h₂) ⊨ φ.
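
To make the semantics above concrete, the following Python sketch evaluates formulas of this separation logic on small heaps by enumerating all heap partitions. It is a brute-force illustration under stated assumptions, not a decision procedure: heaps are dictionaries, formulas are nested tuples whose tags ("emp", "pto", "reach", "sep", ...) are invented for the example, and the quantifier over u ranges over a finite set of locations passed as `locs` instead of the whole of LOC.

```python
from itertools import combinations


def splits(h):
    """Yield every pair (h1, h2) of disjoint heaps such that h1 + h2 = h."""
    keys = list(h)
    for r in range(len(keys) + 1):
        for dom1 in combinations(keys, r):
            h1 = {k: h[k] for k in dom1}
            h2 = {k: v for k, v in h.items() if k not in h1}
            yield h1, h2


def reaches(h, src, dst):
    """True iff dst is obtained from src by applying h at least once."""
    seen, cur = set(), src
    while cur in h and cur not in seen:
        seen.add(cur)
        cur = h[cur]
        if cur == dst:
            return True
    return False


def sat(s, h, phi, locs):
    """Satisfaction of phi on the memory state (s, h); 'exists' is bounded by locs."""
    op = phi[0]
    if op == "top":
        return True
    if op == "not":
        return not sat(s, h, phi[1], locs)
    if op == "and":
        return sat(s, h, phi[1], locs) and sat(s, h, phi[2], locs)
    if op == "emp":
        return not h
    if op == "eq":
        return s[phi[1]] == s[phi[2]]
    if op == "pto":
        return h.get(s[phi[1]]) == s[phi[2]]
    if op == "alloc":
        return s[phi[1]] in h
    if op == "reach":
        return reaches(h, s[phi[1]], s[phi[2]])
    if op == "sep":
        return any(sat(s, h1, phi[1], locs) and sat(s, h2, phi[2], locs)
                   for h1, h2 in splits(h))
    if op == "exists":  # quantification over the reserved variable u
        return any(sat({**s, "u": loc}, h, phi[1], locs) for loc in locs)
    raise ValueError(f"unknown operator {op!r}")


TOP, EMP = ("top",), ("emp",)
NOT_EMP = ("not", EMP)
SIZE_1 = ("and", NOT_EMP, ("not", ("sep", NOT_EMP, NOT_EMP)))
ACYCLIC = ("not", ("exists", ("reach", "u", "u")))


def sabotage(phi):
    """Remove exactly one cell, then check phi: (size = 1) * phi."""
    return ("sep", SIZE_1, phi)


def repeated_sabotage(phi):
    """Remove any number of cells, then check phi: top * phi."""
    return ("sep", TOP, phi)
```

The enumeration in `splits` is exponential in the size of the heap, so the sketch is only meant for experimenting with the definitions on heaps of a few cells.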


In order to perform the reduction from SAT(ALT) to SAT(1SL(∗, alloc, ↪⁺)), we
fix a variable x ∈ VAR that is syntactically different from u and that plays the role of the
target node. Then, the translation τ_x(φ) of a formula φ in ALT is straightforward:


τ_x(T) ≝ u ↪⁺ x.                  τ_x(♦*φ) ≝ ♦*τ_x(φ).             τ_x(⊤) ≝ ⊤.
τ_x(G) ≝ alloc(u) ∧ ¬τ_x(T).      τ_x(¬φ) ≝ ¬τ_x(φ).               τ_x(♦φ) ≝ ♦τ_x(φ).
τ_x(⟨U⟩φ) ≝ ∃u τ_x(φ).            τ_x(φ ∧ ψ) ≝ τ_x(φ) ∧ τ_x(ψ).


Given a pointed forest (F, t, n) and a store s such that s(x) = t and s(u) = n, by structural
induction on φ we can easily show that (F, t, n) ⊨ φ ⇔ (s, F) ⊨ τ_x(φ). This, together
with the fact that ∀u ¬(u ↪⁺ u) characterises the class of acyclic heaps (which correspond
to the forests of ALT), directly implies the following result.


Lemma 8. Let x ∈ VAR \ {u}. φ in ALT and τ_x(φ) ∧ ∀u ¬(u ↪⁺ u) are equisatisfiable.


This lemma reproves that both 1SL(∗, alloc, ↪⁺) and first-order separation logic with
two quantified variables (denoted as 2SL(∗)) admit a TOWER-hard satisfiability problem.
2SL(∗), as introduced in [17], can be defined from 1SL(∗, alloc, ↪⁺) by removing
alloc and ↪⁺ from the syntax and allowing a second variable, different from u, to be
quantified. However, the authors show that both alloc and ↪⁺ are expressible in
2SL(∗), and with some very minor modification to their formulae we can show that both
predicates are definable using ♦ and ♦* instead of ∗ and emp. Moreover, these logics are
in TOWER by Rabin’s Theorem [36], leading to the TOWER-completeness of SAT(ALT).


Theorem 1. SAT(2SL(∗)) and SAT(1SL(∗, alloc, ↪⁺)) are TOWER-complete even when
emp and ∗ are replaced with ♦ and ♦*. SAT(ALT) is TOWER-complete.


4.2 From ALT to Quantified Computation Tree Logic 


We now consider Computation Tree Logic (CTL), a well-known logic for branching
time model checking [14,13]. Among its extensions, in [5,22,28] the addition of
propositional quantification is considered. The satisfiability problem of the resulting logic is
undecidable on Kripke structures, and TOWER-complete on trees [28]. In [5], the authors
show that the problem is TOWER-hard even when considering just one operator among
exists-next EX or exists-finally EF (the definitions are below). Here, we reprove the result
for EF by first tackling the TOWER-hardness of the logic with the exists-until E(φ U ψ),
and then showing that this operator can be defined using EF. Differently from [5] and thanks
to the properties of ALT, our reduction does not nest until operators, showing that
this extension of CTL remains TOWER-hard even when E(φ U ψ) is restricted so that φ
and ψ are Boolean combinations of propositional symbols.


Let us first recall the standard definition of Kripke structure [27]. Let AP ≝ {p, q, …}
be a countable set of propositional symbols. A Kripke structure is a triple (W, R, V)
where W is a countable set of worlds, R ⊆ W × W is a left-total accessibility relation
(left-total means that for each world w ∈ W there is w′ ∈ W s.t. (w, w′) ∈ R) and
V: AP → 2^W is a labelling function. We define R(w) ≝ {w′ ∈ W | (w, w′) ∈ R} as the
set of worlds accessible from w ∈ W. Let R ⊆ W × W be an arbitrary relation on worlds
(not necessarily left-total). A path π starting in w is a sequence of worlds (w₀, w₁, …)
such that w₀ = w and (w_i, w_{i+1}) ∈ R for every two successive elements w_i, w_{i+1} of the
sequence. The path π is said to be maximal whenever it is not a strict prefix of any other
path. We denote with Π_R(w) the set of maximal paths starting in w. If R is left-total
then Π_R(w) is the set of all infinite paths starting in w. Lastly, R*(w) denotes the set of
worlds reachable from w, i.e. those worlds belonging to a path in Π_R(w).
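
For a finite relation, the notions R(w), R*(w) and left-totality can be computed directly. The sketch below is a small illustration with assumed names (`successors`, `reachable`, `is_left_total`), where R is given as a set of pairs of worlds.

```python
from collections import deque


def successors(R, w):
    """R(w): the worlds accessible from w in one step."""
    return {v for (u, v) in R if u == w}


def reachable(R, w):
    """R*(w): the worlds lying on some path that starts in w (w included)."""
    seen, queue = {w}, deque([w])
    while queue:
        u = queue.popleft()
        for v in successors(R, u):
            if v not in seen:
                seen.add(v)
                queue.append(v)
    return seen


def is_left_total(R, W):
    """True iff every world of W has at least one successor."""
    return all(successors(R, w) for w in W)
```

Since every path starting in w has w as its first element, w itself belongs to `reachable(R, w)`.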


We consider Quantified Computation Tree Logic interpreted under tree semantics
(QCTLᵗ) and refer the reader to the literature for a complete description of the logic. The formulae
of QCTLᵗ are built from the following grammar:


φ := ⊤ | φ ∧ φ | ¬φ | p | EX φ | E(φ U φ) | A(φ U φ) | ∃p φ
where p ∈ AP. All temporal modalities of QCTLᵗ are from CTL: EX is the exists-next
modality, E(φ U ψ) is the exists-until modality and A(φ U ψ) is the all-until modality.


QCTLᵗ is interpreted on Kripke trees. Formally, a Kripke structure (W, R, V) is a
(finitely-branching) Kripke tree if (I) R⁻¹ is functional and acyclic, (II) for every world
w ∈ W, R(w) is finite and (III) it has a root, i.e. R*(r) = W for some r ∈ W. Given
w ∈ W, the worlds in R*(w) \ {w} are said to be descendants of w. As Kripke structures
are left-total, Kripke trees can be seen as finitely-branching infinite trees. This leads to
SAT(QCTLᵗ) being in TOWER by reduction to MSO on trees [28]. Let K = (W, R, V)
be a Kripke tree and w ∈ W. The satisfaction relation ⊨ of QCTLᵗ is defined as:

(K, w) ⊨ p ⇔ w ∈ V(p).
(K, w) ⊨ EX φ ⇔ ∃w′ ∈ R(w) s.t. (K, w′) ⊨ φ.
(K, w) ⊨ E(φ U ψ) ⇔ there are (w₀, w₁, …) ∈ Π_R(w) and j ∈ ℕ such that
(K, w_j) ⊨ ψ and for every i < j, (K, w_i) ⊨ φ.
(K, w) ⊨ A(φ U ψ) ⇔ for all (w₀, w₁, …) ∈ Π_R(w), ∃j ∈ ℕ such that
(K, w_j) ⊨ ψ and for every i < j, (K, w_i) ⊨ φ.
(K, w) ⊨ ∃p φ ⇔ there is W′ ⊆ W such that ((W, R, V[p←W′]), w) ⊨ φ,


where, similarly to the store update s[u←ℓ] of the previous section, V[p←W′] stands
for the function obtained from V by updating the evaluation of p from V(p) to W′.
The formula ∃p φ requires to update the satisfaction of p in a way such that φ is
satisfied. This should already give a good clue on how to reduce ALT to QCTLᵗ: we
represent the nodes of a forest as the set of worlds satisfying a propositional symbol D.
Then, for instance, the repeated sabotage operator ♦* is encoded by using an existential
∃E that changes the evaluation of a propositional symbol E so that it only holds in worlds
where D holds. In this way, the set of worlds satisfying E represents a subforest of the
original one. The universal quantification ∀ and the connectives ⇒ and ∨ are defined as
usual. So are the classical temporal operators from [14], exists-finally EF φ ≝ E(⊤ U φ),
all-generally AG φ ≝ ¬EF ¬φ, all-finally AF φ ≝ A(⊤ U φ), exists-generally EG φ ≝
¬AF ¬φ, and exists-strong-release E(φ M ψ) ≝ E(φ U φ ∧ ψ).

We now work towards a formal encoding of a pointed forest (F, t, n) into a pointed
model (K, w), where K = (W, R, V) is a Kripke tree and w is one of its worlds. We use w
to play the role of the target node t. To encode the forest F and the current evaluation node
n we use the worlds appearing in R*(w) and three propositional symbols: D, end and
n. The intended use of D is to state which elements of R*(w) encode nodes in dom(F).
We need to be careful here, as R*(w) is an infinite set whereas dom(F) is finite. We
use the propositional symbol end to solve this inconsistency: we constrain K to satisfy
the formula AF(end) stating that every maximal path (w₀, w₁, …) ∈ Π_R(w) has a finite
prefix (w₀, …, w_{j−1}) (j ∈ ℕ) of worlds not satisfying end, whereas w_j ∈ V(end). Then,
a world in W encodes an element in dom(F) whenever it satisfies D and it belongs to one
of these prefixes. We use the propositional symbol n to encode the current evaluation
node. During the translation we require n to be satisfied by exactly one descendant of w,
so that the modality ⟨U⟩ roughly becomes a quantification over n. From [28], checking
whether a formula φ holds in exactly one descendant of w can be done with the formula
uniq(φ) ≝ EF(φ) ∧ ∀p (EF(φ ∧ p) ⇒ AG(φ ⇒ p)) where p ∈ AP does not appear
in φ. For technical reasons, we treat in a similar way the world w, which encodes the
target node, and require it to be the only world (among the ones in R*(w)) satisfying the
auxiliary propositional symbol t. Lastly, we use an additional propositional symbol E
in order to encode subforests and deal with the encoding of ♦ and ♦* (as stated above).


We now formalise the encoding. For the remainder of this section, we fix a tuple
X ≝ (end, n, t) of three different propositional symbols. Let D be an additional symbol
not in X, and let (F, t, n) be a pointed forest s.t. t ∉ dom(F) (by Lemma 4(1) it is sufficient
to consider this class of structures in order to decide satisfiability of a formula in ALT). A
pointed model (K = (W, R, V), w) is an (X, D)-encoding of (F, t, n), or simply encoding
when (X, D) is clear from the context, if there is an injection f: ℕ → R*(w) s.t.
1. f(t) = w is the only world in ran(f) ∩ V(t), and f(n) is the only world in ran(f) ∩ V(n);
2. for every n′ ∈ dom(F) it holds that (f(F(n′)), f(n′)) ∈ R;
3. for every infinite path (w₀, w₁, …) ∈ Π_R(w) there is i ≥ 0 s.t. w_i ∈ V(end) and
- ∀j ∈ [0, i − 1], w_j ∉ V(end) and (w_j ∈ V(D) ⇔ ∃n′ ∈ dom(F) f(n′) = w_j);
- for every j ≥ i and every node n′ ∈ dom(F), f(n′) ≠ w_j.
It is easy to show that such an encoding always exists. Informally, the first property states
that w encodes t and is the only world in R*(w) satisfying t. Similarly, the world f(n)
encoding n is the only world in R*(w) that satisfies n. The second property states that
the forest must be correctly encoded in the Kripke structure. In particular, notice that the
parent relation of the finite forest is inverted so that it becomes the child relation in the
Kripke structure (as shown in Figure 4). As f is an injection, the encoding does not merge
together trees that are disconnected in the forest. Lastly, the third property of f states that
the elements in dom(F) must be encoded by nodes in R*(w) that precede every world
satisfying end. Moreover, among all the descendants of w preceding end, the worlds
encoding dom(F) are the only ones satisfying D. This implies that w does not satisfy D
(as t ∉ dom(F)). Figure 4 shows a pointed forest and one of its possible encodings.

Fig. 4. A pointed forest (left) and one of its encodings as a finitely-branching Kripke tree (right).

We now formalise the translation. Fix two different symbols D, E not in X. In order
to alternate between D and E, we define D̄ ≝ E and Ē ≝ D. The translation τ_u(φ) of a
formula φ in ALT, implicitly parametrised by X and where u ∈ {D, E}, is homomorphic
for ⊤ and Boolean connectives (as in τ_x, see Section 4.1), and otherwise it is defined as


τ_u(T) ≝ E(((u ∨ t) ∧ ¬end) M (u ∧ n)).      τ_u(⟨U⟩φ) ≝ ∃n (uniq(n) ∧ τ_u(φ)).
τ_u(G) ≝ E(¬end M (u ∧ n)) ∧ ¬τ_u(T).        τ_u(♦*φ) ≝ ∃ū (AG(ū ⇒ u) ∧ τ_ū(φ)).
τ_u(♦φ) ≝ ∃ū (AG(ū ⇒ u) ∧ uniq(u ∧ ¬ū) ∧ E(¬end M (u ∧ ¬ū)) ∧ τ_ū(φ)).


Let (F, t, n) be a pointed forest s.t. t ∉ dom(F) and let ((W, R, V), w) be one of its
(X, u)-encodings w.r.t. the injection f. For instance, τ_u(T) requires that there is a path
(w, w₁, …, w_i) starting in f(t) = w and whose worlds do not satisfy end and must satisfy u
or t. Moreover, the last world w_i must satisfy u and n. From property (1) of the definition
of f, the only element satisfying t is w, which does not satisfy u (as t ∉ dom(F)). Then,
this path of worlds encodes a path in the pointed forest, from the current evaluation node
n (which is encoded by the only world satisfying n) to the target node t. The translation is
shown correct (by structural induction on φ) for pointed forests that admit an encoding.


Lemma 9. Let (F, t, n) be a pointed forest s.t. t ∉ dom(F), and let (K, w) be an (X, u)-encoding
of (F, t, n). Given a formula φ in ALT, (F, t, n) ⊨ φ if and only if (K, w) ⊨ τ_u(φ).


Then, to conclude the reduction we just need to characterise the set of models encoding a
pointed forest. The formula enc ≝ ¬D ∧ t ∧ uniq(t) ∧ uniq(n) ∧ AF(end) does the job.

Lemma 10. φ in ALT and enc ∧ τ_D(φ) in QCTLᵗ are equisatisfiable.


We now take a closer look at the translation. Given a temporal modality 𝒯 and
k ∈ ℕ ∪ {ω}, QCTLᵗ(𝒯^k) denotes the fragment of QCTLᵗ restricted to formulae where the
only temporal modality allowed is 𝒯, which can be nested at most k times (ω stands for an
arbitrary number of nestings). For instance, QCTLᵗ(EF^k) denotes the set of formulae
restricted to the operator EF, which can be nested at most k times. This fragment of QCTLᵗ
is shown to be k-NEXPTIME-hard in [5], which directly leads to the TOWER-hardness
of QCTLᵗ(EF^ω) and QCTLᵗ(EU^ω). By analysing our translation it is easy to show that
QCTLᵗ(EU⁰), i.e. QCTLᵗ restricted to the only modality E(φ U ψ) where φ and ψ are
Boolean combinations of propositional symbols, and QCTLᵗ(EF¹) are already TOWER-hard.
First of all, the formula E(φ U ψ) in QCTLᵗ(EU⁰) is equivalent to the following
formula in QCTLᵗ(EF¹): ∃p (AG(¬φ ∧ ¬ψ ⇒ p) ∧ AG(p ⇒ AG p) ∧ EF(ψ ∧ ¬p)), where
p does not appear in φ or ψ. Then, we just need to prove the result for QCTLᵗ(EU⁰).
Clearly, the translation τ_u is defined so that the resulting formula is in QCTLᵗ(EU⁰).
However, we need to deal with the occurrence of AF(end) used inside the formula enc.
Let us first consider the formula AG(φ ⇒ AG ψ) which is satisfied by models where
once φ is found to hold in a certain world w, then ψ is satisfied in every world of R*(w).
Despite not being in QCTLᵗ(EU⁰), the formula AG(φ ⇒ AG ψ) is equivalent to the
following formula: ∀p∀q (uniq(p) ∧ uniq(q) ∧ EF(p ∧ φ) ∧ EF(q ∧ ¬ψ) ⇒ E(¬p M q)),
where p and q do not appear in φ or ψ. We then define a formula γ_EG(φ) that only uses
EF modalities and is equivalent to EG φ, so that then ¬γ_EG(¬φ) is equivalent to AF φ:


γ_EG(φ) ≝ ∃p (¬p ∧ AG(¬φ ⇒ p) ∧ AG(p ⇒ AG p) ∧
∀q (uniq(q) ∧ EF(q ∧ ¬p) ⇒ EF(q ∧ EF(¬q ∧ ¬p))))


where p does not appear in φ. This formula is expressible in QCTLᵗ(EU⁰), as every
subformula that is not in this fragment is an instance of AG(φ ⇒ AG ψ). Then, we
conclude that AF(end) is expressible in QCTLᵗ(EU⁰), leading to the following result.


Theorem 2. The satisfiability problems of QCTLᵗ(EU⁰) and QCTLᵗ(EF¹) are TOWER-complete.


4.3 From ALT to Modal Logic of Heaps and Modal Separation Logic 


Two families of logics, respectively called modal logic of heaps (MLH) and modal
separation logic (MSL), have been presented in the literature. At their core, both logics can be
seen as modal logics extended with separating connectives, hence mixing separation logic
(Section 4.1) with temporal aspects as in quantified CTL (Section 4.2). As we have already
shown how ALT is captured by these two latter logics, it is natural to ask ourselves if
the same holds for MLH and MSL. In this section, we show that this is indeed the case
and, as for the previous two sections, ALT allows us to refine the analysis on these logics.
Both MLH and MSL are interpreted on finite Kripke functions. A finite Kripke function
is a Kripke structure (W, R, V) (see Section 4.2 for its definition) where W is infinite
and R, instead of being left-total, is finite and weakly functional, i.e. |R| ∈ ℕ and for
every w, w′, w″ ∈ W, if (w, w′) ∈ R and (w, w″) ∈ R then w′ = w″. As ℕ and W are
both countably infinite sets, without loss of generality we assume W = ℕ. Two Kripke
structures K₁ = (W, R₁, V) and K₂ = (W, R₂, V) are disjoint if R₁ ∩ R₂ = ∅. When
this holds, K₁ + K₂ denotes the model (W, R₁ ∪ R₂, V). To shorten the presentation,
in the following diagram we introduce a language having the operators from MSL and
MLH, and summarise known and new results on these logics (where p ∈ AP):


MSL: TOWER-complete from [18].    MLH: TOWER-complete from [17].


φ := p | ⟨≠⟩φ | ⊤ | φ ∧ ψ | ¬φ | ◇φ | φ ∗ ψ | ⟨U⟩φ | ◇⁻¹φ
TOWER-hard by reduction from SAT(ALT), shown here.

As defined below, ◇ is the standard alethic modality from modal logic, ◇⁻¹ is its converse
modality, and ⟨≠⟩ is the elsewhere modality that generalises the somewhere modality
⟨U⟩ as ⟨U⟩φ ≝ φ ∨ ⟨≠⟩φ. For a pointed model (K, w), where K = (W, R, V) is a finite
Kripke function and w ∈ W, the satisfaction relation ⊨ is defined as follows:

(K, w) ⊨ p ⇔ w ∈ V(p).

(K, w) ⊨ ◇φ ⇔ there is w′ ∈ R(w) such that (K, w′) ⊨ φ.

(K, w) ⊨ ◇⁻¹φ ⇔ there is w′ ∈ W such that w ∈ R(w′) and (K, w′) ⊨ φ.

(K, w) ⊨ ⟨≠⟩φ ⇔ there is w′ ∈ W such that w′ ≠ w and (K, w′) ⊨ φ.

(K, w) ⊨ φ ∗ ψ ⇔ (K₁, w) ⊨ φ and (K₂, w) ⊨ ψ for some K₁, K₂ s.t. K₁ + K₂ = K.


By looking at the diagram above, compared to the work in [18], ALT allows us to show
that propositional symbols and the elsewhere modality can be removed from MSL without
changing the complexity status of its satisfiability problem. Similarly, ALT allows us to
refine the analysis on the complexity of SAT(MLH) by showing that the ◇⁻¹ modality is
not needed in order to achieve non-elementary complexities.


Let (F, t, n) be a pointed forest and let (K, w) be a pointed model where K =
(W, R, V). For the reduction, we use w to encode the current node n. Encoding t
is not so immediate, as MLH does not have propositional symbols. A possible
solution is to encode it as a self-loop, so that the formula T is translated to a query
stating that w reaches the self-loop. As done in Section 4.1 we define the formula
size=1 ≝ ⟨U⟩◇⊤ ∧ ¬(⟨U⟩◇⊤ ∗ ⟨U⟩◇⊤), that is satisfied whenever |R| = 1. We also
define the modalities ♦ and ♦* in MLH: ♦φ ≝ (size=1) ∗ φ and ♦*φ ≝ ⊤ ∗ φ. Lastly,
we introduce the formula selfloop ≝ ♦*(◇◇⊤ ∧ ¬♦♦⊤) that is satisfied by (K, w)
if (w, w) ∈ R. Suppose for a moment that we are able to use this formula to characterise
the class of finite Kripke functions (W, R, V) where there is exactly one cycle,
and this cycle is a self-loop on a world w_t. Then, we use w_t to encode the target node
t of a finite forest (F, t, n) while being careful that the ♦ and ♦* operators of ALT are
translated in such a way that the self-loop on w_t is preserved. Because of the specific
treatment of w_t, it is convenient to assume that the current evaluation node n is encoded
by a world different from w_t, which reflects on the translation of ⟨U⟩. The admissibility
of this assumption follows by Lemma 4.
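
The formulas size=1 and selfloop can be tested on small finite Kripke functions with a brute-force evaluator. The Python sketch below is illustrative only and makes the following assumptions: a finite Kripke function is a dictionary mapping a world to its unique successor, formulas are nested tuples with invented tags ("dia", "some", "sep"), the somewhere modality ranges over a finite collection `worlds` that is expected to contain every world occurring in R, and selfloop is read as ♦*(◇◇⊤ ∧ ¬♦♦⊤).

```python
from itertools import combinations


def splits(R):
    """Yield every pair (R1, R2) of disjoint finite functions with R1 + R2 = R."""
    keys = list(R)
    for r in range(len(keys) + 1):
        for dom1 in combinations(keys, r):
            R1 = {k: R[k] for k in dom1}
            R2 = {k: v for k, v in R.items() if k not in R1}
            yield R1, R2


def sat(R, w, phi, worlds):
    """Satisfaction of phi at world w of the finite Kripke function R."""
    op = phi[0]
    if op == "top":
        return True
    if op == "not":
        return not sat(R, w, phi[1], worlds)
    if op == "and":
        return sat(R, w, phi[1], worlds) and sat(R, w, phi[2], worlds)
    if op == "dia":   # w has a successor satisfying the argument
        return w in R and sat(R, R[w], phi[1], worlds)
    if op == "some":  # some world (among `worlds`) satisfies the argument
        return any(sat(R, v, phi[1], worlds) for v in worlds)
    if op == "sep":   # split the accessibility relation in two
        return any(sat(R1, w, phi[1], worlds) and sat(R2, w, phi[2], worlds)
                   for R1, R2 in splits(R))
    raise ValueError(f"unknown operator {op!r}")


TOP = ("top",)
HAS_EDGE = ("some", ("dia", TOP))
SIZE_1 = ("and", HAS_EDGE, ("not", ("sep", HAS_EDGE, HAS_EDGE)))


def sabotage(phi):
    """Remove exactly one edge, then check phi."""
    return ("sep", SIZE_1, phi)


def repeated_sabotage(phi):
    """Remove any number of edges, then check phi."""
    return ("sep", TOP, phi)


# Keep a submodel with at most one edge in which w still sees two steps ahead:
# the only possibility is the single edge (w, w).
SELFLOOP = repeated_sabotage(
    ("and", ("dia", ("dia", TOP)), ("not", sabotage(sabotage(TOP))))
)
```

On a function with the edges 0→0, 1→0 and 2→1, the formula SELFLOOP holds at world 0 only, which matches the intended reading "(w, w) ∈ R".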


We encode pointed forests as finite Kripke functions. Let (F, t, n) be a pointed forest
s.t. t ∉ dom(F) and n ≠ t. A pointed finite Kripke function ((ℕ, R, V), n) (recall, W = ℕ) is
an encoding of (F, t, n) iff for every n′, n″ ∈ ℕ we have (n′, n″) ∈ R ⇔ (F(n′) = n″ or
n′ = n″ = t). Notice how R is essentially defined from F by adding the self-loop (t, t).
The translation τ(φ) in MLH of a formula φ in ALT is homomorphic for ⊤ and Boolean
connectives (as is the case for τ_x in Section 4.1), and otherwise it is defined as


τ(T) ≝ ♦*(◇⊤ ∧ [U](◇⊤ ⇒ ◇◇⊤)).      τ(♦φ) ≝ ♦(τ(φ) ∧ ⟨U⟩selfloop).
τ(G) ≝ ◇⊤ ∧ ¬τ(T).                    τ(♦*φ) ≝ ♦*(τ(φ) ∧ ⟨U⟩selfloop).
τ(⟨U⟩φ) ≝ ⟨U⟩(¬selfloop ∧ τ(φ)).


We highlight two points of this translation. First, τ(T) essentially asks to find a submodel
where every path reaches the self-loop and the current evaluation node is in one of these
paths. Second, notice how the translation of ♦ and ♦* checks that the model is updated
so that the self-loop is not lost, as required by our encoding. It should be noted that
this requirement cannot be met if we were translating the definition of ALT from [31],
featuring the ∗ operator. Indeed, by partitioning the model into two pieces, this operator
removes the self-loop from one of the two parts, breaking our encoding. The following
lemma (proved by structural induction on φ) shows the correctness of our translation.


Lemma 11. Let (F, t, n) be a pointed forest s.t. n ≠ t and t ∉ dom(F). Let (K, n) be an
encoding of (F, t, n). Given a formula φ in ALT, (F, t, n) ⊨ φ iff (K, n) ⊨ τ(φ).


To conclude the reduction we show that we can characterise the class of models encoding
pointed forests, i.e. the finite Kripke functions with exactly one cycle, which is a self-loop.
We first define the formula hascycl ≝ ♦*(⟨U⟩◇⊤ ∧ [U](◇⊤ ⇒ ◇◇⊤)) that checks if
a finite Kripke function has at least one cycle. Then, the desired property can be simply
defined by stating that there is a self-loop which, whenever removed, leads to an acyclic
submodel: 1selfloop ≝ ⟨U⟩(selfloop ∧ ¬♦(□⊥ ∧ hascycl)).


Lemma 12. Every formula φ in ALT is equisatisfiable with τ(φ) ∧ 1selfloop.


For the proof of Lemma 12, both Lemma 4(1) and (2) are used in order to restrict ourselves
to pointed forests (F, t, n) s.t. n ≠ t and t ∉ dom(F). Then, we apply Lemma 11.


Theorem 3. The fragment of MLH and MSL with Boolean operators, ◇ and ⟨U⟩ modalities,
and ∗ (alternatively, ♦ and ♦*) has a TOWER-complete satisfiability problem.


5 Conclusions 


We studied an Auxiliary Logic on Trees (ALT), a quite simple formalism that admits a 
TOWER-complete satisfiability problem. ALT is shown to be easily captured by various 
non-elementary logics: first-order separation logic, quantified CTL, modal logic of heaps 
and modal separation logic. Through ALT, we were not only able to connect these logics, 
but also to refine their analysis and find strict fragments that are still TOWER-hard. Most 
importantly, with ALT we hope to have shown a set of simple and concrete properties, 
centred around reachability and submodel reasoning, that when put together lead to logics 
having a non-elementary satisfiability problem. 

This work leaves a few questions open. First, the fragments of ALT where ♦ or ♦* are
removed from the logic have not been studied yet. The logic without ♦* is of particular
interest, as it is connected with the sabotage logics from [4]. Second, the analysis done
on first-order separation logic and on modal logic of heaps (Sections 4.1 and 4.3) reveals
that the complexity of these logics does not change when the ∗ operator and the emp
predicate are replaced with the less general operators ♦ and ♦*. We find this point
interesting, as from an overview of the literature, it seems that this result also holds
for the separation logics considered in [9,17,19,30,31]. Moreover, for the logics whose
expressiveness is known, i.e. the ones in [19,30], it seems that also the expressive power
remains unchanged. However, we struggle to see how to uniformly express the operator ∗
with ♦ and ♦* as the resulting logics reason on the model in a different way (as shown
in Section 2). Lastly, this work illustrates the potential of ALT as a tool for proving the
TOWER-hardness of logics interpreted on tree-like structures. As the operators of our
logic are simple, we hope ALT to be useful to study logics with unknown complexities.

Abstract. In model checking, partial-order reduction (POR) is an effective
technique to reduce the size of the state space. Stubborn sets are
an established variant of POR and have seen many applications over the
past 31 years. One of the early works on stubborn sets shows that a combination
of several conditions on the reduction is sufficient to preserve
stutter-trace equivalence, making stubborn sets suitable for model checking
of linear-time properties. In this paper, we identify a flaw in the reasoning
and show with a counter-example that stutter-trace equivalence
is not necessarily preserved. We propose a solution together with an updated
correctness proof. Furthermore, we analyse in which formalisms
this problem may occur. The impact on practical implementations is
limited, since they all compute a correct approximation of the theory.


1 Introduction 


In formal methods, model checking is a technique to automatically decide the 
correctness of a system’s design. The many interleavings of concurrent processes 
can cause the state space to grow exponentially with the number of components, 
known as the state-space explosion problem. Partial-order reduction (POR) is 
one technique that can alleviate this problem. Several variants of POR exist, 
such as ample sets [11], persistent sets and stubborn sets [16,21]. For each of
those variants, sufficient conditions for preservation of stutter-trace equivalence
have been identified. Since LTL without the next operator (LTL−X) is invariant
under finite stuttering, this allows one to check most LTL properties under POR.

However, the correctness proofs for these methods are intricate and not often
reproduced. For stubborn sets, LTL_x-preserving conditions and an accompanying
correctness result were first presented in [15], and discussed in more
detail in [17]. While trying to reproduce the proof for [17, Theorem 2] (see also
Theorem 1 in the current work), we ran into an issue while trying to prove a
certain property of the construction used in the original proof [17, Construction
1]. This led us to discover that stutter-trace equivalence is not necessarily
preserved. We will refer to this as the inconsistent labelling problem. The essence
of the problem is that POR in general, and the proofs in [17] in particular,
reason mostly about actions, which label the transitions. The only relevance of
the state labelling is that it determines which actions are visible. On the other
hand, stutter-trace equivalence and the LTL semantics are purely based on state
labels. The correctness proof in [17] does not deal properly with this disparity.
Further investigation shows that the same problem also occurs in two works of
Beneš et al. [2,3], who apply ample sets to state/event LTL model checking.

Consequently, any application of stubborn sets in LTL_x model checking
is possibly unsound, both for safety and liveness properties. In the literature, the
correctness of several theories [9,10,18] relies on the incorrect theorem.

Our contributions are as follows: 


— We prove the existence of the inconsistent labelling problem with a counter- 
example. This counter-example is valid for weak stubborn sets and, with a 
small modification, in a non-deterministic setting for strong stubborn sets. 

— We propose to strengthen one of the stubborn set conditions and show that
this modification resolves the issue (Theorem 2).

— We analyse in which circumstances the inconsistent labelling problem occurs 
and, based on the conclusions, discuss its impact on existing literature. This 
includes a thorough analysis of Petri nets and several different notions of 
invisible transitions and atomic propositions. 


Our investigation shows that probably all practical implementations of stubborn 
sets compute an approximation which resolves the inconsistent labelling problem. 
Furthermore, POR methods based on the standard independence relation, such 
as ample sets and persistent sets, are not affected. 

The rest of the paper is structured as follows. In Section 2, we introduce
the basic concepts of stubborn sets and stutter-trace equivalence, which is not
preserved in the counter-example of Section 3. A solution to the inconsistent
labelling problem is discussed in Section 4, together with an updated correctness
proof. Sections 5 and 6 discuss several settings in which correctness is not
affected. Finally, Section 7 presents related work and Section 8 presents a
conclusion.


2 Preliminaries 


Since LTL relies on state labels and POR relies on edge labels, we assume the 
existence of some fixed set of atomic propositions AP to label the states and 
a fixed set of edge labels Act, which we will call actions. Actions are typically 
denoted with the letter a. 


Definition 1. A labelled state transition system, LSTS for short, is a directed
graph $TS = (S, \to, \hat{s}, L)$, where:


— $S$ is the state space;

— ${\to} \subseteq S \times Act \times S$ is the transition relation;

— $\hat{s} \in S$ is the initial state; and

— $L : S \to 2^{AP}$ is a function that labels states with atomic propositions.
We write $s \xrightarrow{a} t$ whenever $(s,a,t) \in {\to}$. A path is a (finite or infinite)
alternating sequence of states and actions: $s_0 \xrightarrow{a_1} s_1 \xrightarrow{a_2} s_2 \dots$. We sometimes omit
the intermediate and/or final states if they are clear from the context or not
relevant, and write $s \xrightarrow{a_1 \dots a_n} t$ or $s \xrightarrow{a_1 \dots a_n}$ for finite paths and $s \xrightarrow{a_1 a_2 \dots}$ for infinite
paths. Paths that start in the initial state $\hat{s}$ are called initial paths. Given a path
$\pi = s_0 \xrightarrow{a_1} s_1 \xrightarrow{a_2} s_2 \dots$, the trace of $\pi$ is the sequence of state labels observed
along $\pi$, viz. $L(s_0)L(s_1)L(s_2)\dots$. An action $a$ is enabled in a state $s$, notation
$s \xrightarrow{a}$, if and only if there is a transition $s \xrightarrow{a} t$ for some $t$. In a given LSTS TS,
$enabled_{TS}(s)$ is the set of all enabled actions in a state $s$. A set $\mathcal{I}$ of invisible
actions is chosen such that if (but not necessarily only if) $a \in \mathcal{I}$, then for all
states $s$ and $t$, $s \xrightarrow{a} t$ implies $L(s) = L(t)$. Note that this definition allows the set
$\mathcal{I}$ to be under-approximated. An action that is not invisible is called visible. We
say TS is deterministic if and only if $s \xrightarrow{a} t$ and $s \xrightarrow{a} t'$ imply $t = t'$, for all states
$s$, $t$ and $t'$ and actions $a$. To indicate that TS is not necessarily deterministic,
we say TS is non-deterministic.
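The notions introduced so far can be made concrete for finite, explicitly stored systems. The following is a minimal sketch, assuming states are hashable values and transitions are stored as a set of triples; the class name `LSTS` and its method names are illustrative choices, not part of the original formalisation.

```python
class LSTS:
    """Explicit finite LSTS: states, labelled transitions, initial state, labelling."""

    def __init__(self, states, transitions, initial, labels):
        self.states = set(states)
        self.transitions = set(transitions)  # set of (s, a, t) triples
        self.initial = initial
        self.labels = dict(labels)           # state -> frozenset of atomic propositions

    def enabled(self, s):
        """All actions a such that s --a--> t for some t."""
        return {a for (src, a, _) in self.transitions if src == s}

    def is_deterministic(self):
        """True iff s --a--> t and s --a--> t' imply t == t'."""
        target = {}
        for (s, a, t) in self.transitions:
            if target.setdefault((s, a), t) != t:
                return False
        return True

    def invisible_actions(self):
        """Largest admissible choice of I: actions that never change the state label.
        Any subset of this set is also a valid (under-approximated) choice."""
        actions = {a for (_, a, _) in self.transitions}
        visible = {a for (s, a, t) in self.transitions
                   if self.labels[s] != self.labels[t]}
        return actions - visible


def trace(ts, states_on_path):
    """Sequence of state labels observed along a finite path."""
    return [ts.labels[s] for s in states_on_path]
```

Because the definition only requires that invisible actions never change the label, `invisible_actions` returns the largest admissible set; an implementation is free to work with a smaller one.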


2.1 Stubborn sets 


In POR, reduction functions play a central role. A reduction function $r : S \to 2^{Act}$
indicates which transitions to explore in each state. When starting at the
initial state $\hat{s}$, a reduction function induces a reduced LSTS as follows.


Definition 2. Let $TS = (S, \to, \hat{s}, L)$ be an LSTS and $r : S \to 2^{Act}$ a reduction
function. Then the reduced LSTS induced by $r$ is defined as $TS_r = (S_r, \to_r, \hat{s}, L_r)$,
where $L_r$ is the restriction of $L$ on $S_r$, and $S_r$ and $\to_r$ are the smallest
sets such that the following holds:

— $\hat{s} \in S_r$; and

— If $s \in S_r$, $s \xrightarrow{a} t$ and $a \in r(s)$, then $t \in S_r$ and $s \xrightarrow{a}_r t$.
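For a finite system, the smallest sets in Definition 2 can be computed by a standard worklist exploration from the initial state. The sketch below assumes transitions are given as a set of triples and that the reduction function is an ordinary Python callable; the name `reduce_lsts` is hypothetical.

```python
from collections import deque


def reduce_lsts(transitions, initial, r):
    """Compute (S_r, ->_r) for a finite LSTS.

    transitions: set of (s, a, t) triples of the full LSTS
    initial:     the initial state
    r:           reduction function, maps a state to a set of actions
    """
    successors = {}
    for (s, a, t) in transitions:
        successors.setdefault(s, []).append((a, t))

    reduced_states = {initial}
    reduced_transitions = set()
    queue = deque([initial])
    while queue:
        s = queue.popleft()
        selected = r(s)
        for (a, t) in successors.get(s, []):
            if a in selected:                 # only explore actions chosen by r(s)
                reduced_transitions.add((s, a, t))
                if t not in reduced_states:
                    reduced_states.add(t)
                    queue.append(t)
    return reduced_states, reduced_transitions


# A diamond in which only action "a" is selected in state 0.
diamond = {(0, "a", 1), (0, "b", 2), (1, "b", 3), (2, "a", 3)}
states_r, trans_r = reduce_lsts(diamond, 0, lambda s: {"a"} if s == 0 else {"a", "b"})
```

In this example state 2 is never visited, so the reduced system keeps three of the four states.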


Note that we have ${\to_r} \subseteq {\to}$. In the remainder of this paper, we will assume
the reduced LSTS is finite. This is essential for the correctness of the approach
detailed below. In general, a reduction function is not guaranteed to preserve
almost any property of an LSTS. Below, we list a number of conditions that have
been proposed in the literature; they aim to preserve LTL_x. Here, we call an action
$a$ a key action in $s$ iff for all paths $s \xrightarrow{a_1 \dots a_n} s'$ such that $a_1 \notin r(s), \dots, a_n \notin r(s)$,
it holds that $s' \xrightarrow{a}$. We typically denote key actions by $a_{key}$.


D0 If $enabled(s) \neq \emptyset$, then $r(s) \cap enabled(s) \neq \emptyset$.

D1 For all $a \in r(s)$ and $a_1 \notin r(s), \dots, a_n \notin r(s)$, if $s \xrightarrow{a_1} \cdots \xrightarrow{a_n} s_n \xrightarrow{a} s'_n$, then
there are states $s', s'_1, \dots, s'_{n-1}$ such that $s \xrightarrow{a} s' \xrightarrow{a_1} s'_1 \xrightarrow{a_2} \cdots \xrightarrow{a_n} s'_n$.

D2 Every enabled action in $r(s)$ is a key action in $s$.

D2w If $enabled(s) \neq \emptyset$, then $r(s)$ contains a key action in $s$.

V If $r(s)$ contains an enabled visible action, then it contains all visible actions.

I If an invisible action is enabled, then $r(s)$ contains an invisible key action.

L For every visible action $a$, every cycle in the reduced LSTS contains a
state $s$ such that $a \in r(s)$.
Fig. 1: Visual representation of condition D1.


These conditions are used to define strong and weak stubborn sets in the 
following way. 


Definition 3. A reduction function $r : S \to 2^{Act}$ is a strong stubborn set iff
for all states $s \in S$, the conditions D0, D1, D2, V, I, L all hold.


Definition 4. A reduction function $r : S \to 2^{Act}$ is a weak stubborn set iff for
all states $s \in S$, the conditions D1, D2w, V, I, L all hold.


Below, we also use ‘weak/strong stubborn set’ to refer to the set of actions
$r(s)$ in some state $s$. First, note that key actions are always enabled, by setting
$n = 0$. Furthermore, a stubborn set can never introduce new deadlocks, either by
D0 or D2w. Condition D1 enforces that a key action $a_{key} \in r(s)$ does not disable
other paths that are not selected for the stubborn set. A visual representation
of condition D1 can be found in Figure 1. When combined, D1 and D2w are
sufficient conditions for preservation of deadlocks. Condition V enforces that the
paths $s \xrightarrow{a_1 \dots a_n a} s'_n$ and $s \xrightarrow{a a_1 \dots a_n} s'_n$ in D1 contain the same sequence of visible
actions. The purpose of condition I is to preserve the possibility to perform
an invisible action, if one is enabled. Finally, we have condition L to deal with
the action-ignoring problem, which occurs when an action is never selected for
the stubborn set and always ignored. Since we assume that the reduced LSTS
is finite, it suffices to reason in L about every cycle instead of every infinite
path. The combination of I and L helps to preserve divergences (infinite paths
containing only invisible actions).

Conditions D0 and D2 together imply D2w, and thus every strong stubborn
set is also a weak stubborn set. Since the reverse does not necessarily hold, weak
stubborn sets might offer more reduction.
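Some of these conditions are local to a single state and can be checked directly on a finite LSTS. The sketch below covers key actions, D0, D2w and V only; the path-based condition D1 and the cycle condition L are deliberately left out. All function names are hypothetical, and `r_s` stands for the candidate set $r(s)$.

```python
def enabled(transitions, s):
    """Actions enabled in state s; transitions is a set of (s, a, t) triples."""
    return {a for (src, a, _) in transitions if src == s}


def is_key_action(transitions, s, r_s, a):
    """a is key in s iff a is enabled in every state reachable from s
    via actions outside r(s); s itself is included (the case n = 0)."""
    stack, seen = [s], {s}
    while stack:
        u = stack.pop()
        if a not in enabled(transitions, u):
            return False
        for (src, b, t) in transitions:
            if src == u and b not in r_s and t not in seen:
                seen.add(t)
                stack.append(t)
    return True


def satisfies_d0(transitions, s, r_s):
    en = enabled(transitions, s)
    return not en or bool(r_s & en)


def satisfies_d2w(transitions, s, r_s):
    en = enabled(transitions, s)
    return not en or any(is_key_action(transitions, s, r_s, a) for a in r_s)


def satisfies_v(transitions, s, r_s, visible):
    """If r(s) contains an enabled visible action, it must contain all visible ones."""
    en = enabled(transitions, s)
    return not (r_s & en & visible) or visible <= r_s
```

The search in `is_key_action` terminates because it visits each state at most once, which is why the sketch is restricted to finite systems.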


2.2 Weak and Stutter Equivalence 


To reason about the similarity of an LSTS $TS$ and its reduced LSTS $TS_r$, we
introduce the notions of weak equivalence, which operates on actions, and stutter
equivalence, which operates on states. The definitions are generic, so that they
can also be used in Section 6.


Definition 5. Two paths $\pi$ and $\pi'$ are weakly equivalent with respect to a set of
actions $A$, notation $\pi \sim_A \pi'$, if and only if they are both finite or both infinite
and their respective projections on $Act \setminus A$ are equal.

Definition 6. The no-stutter trace under labelling $L$ of a path $s_0 \xrightarrow{a_1} s_1 \xrightarrow{a_2} \dots$
is the sequence of those $L(s_i)$ such that $i = 0$ or $L(s_i) \neq L(s_{i-1})$. Paths $\pi$ and
$\pi'$ are stutter equivalent under $L$, notation $\pi \triangleq_L \pi'$, iff they are both finite or
both infinite, and they yield the same no-stutter trace under $L$.


We typically consider weak equivalence with respect to the set of invisible
actions $\mathcal{I}$. In that case, we write $\pi \sim \pi'$. We also omit the subscript for stutter
equivalence when reasoning about the standard labelling function and write
$\pi \triangleq \pi'$. Remark that stutter equivalence is invariant under finite repetitions of
state labels, hence its name. We lift both equivalences to LSTSs, and say that
$TS$ and $TS'$ are weak-trace equivalent iff for every initial path $\pi$ in $TS$, there is
a weakly equivalent initial path $\pi'$ in $TS'$ and vice versa. Likewise, $TS$ and $TS'$
are stutter-trace equivalent iff for every initial path $\pi$ in $TS$, there is a stutter
equivalent initial path $\pi'$ in $TS'$ and vice versa.
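For finite paths, both equivalences reduce to simple list comparisons. The following sketch assumes a path is represented by its list of actions (for weak equivalence) or its list of state labels (for stutter equivalence); infinite paths are out of scope, and the function names are illustrative.

```python
def no_stutter_trace(labels):
    """Keep L(s_i) only if i == 0 or L(s_i) != L(s_{i-1})."""
    result = []
    for label in labels:
        if not result or result[-1] != label:
            result.append(label)
    return result


def stutter_equivalent(labels1, labels2):
    """Finite paths are stutter equivalent iff their no-stutter traces coincide."""
    return no_stutter_trace(labels1) == no_stutter_trace(labels2)


def weakly_equivalent(actions1, actions2, hidden):
    """Finite paths are weakly equivalent w.r.t. `hidden` iff their
    projections on the remaining actions are equal."""
    projection1 = [a for a in actions1 if a not in hidden]
    projection2 = [a for a in actions2 if a not in hidden]
    return projection1 == projection2
```

Note that the two checks take different inputs: weak equivalence never looks at state labels, and stutter equivalence never looks at actions.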

In general, weak equivalence and stutter equivalence are incomparable, even 
for initial paths. However, for some LSTSs, these notions can be related in a 
certain way. We formalise this in the following definition. 


Definition 7. Let $TS$ be an LSTS and $\pi$ and $\pi'$ two paths in $TS$ that both start
in some state $s$. Then, $TS$ is labelled consistently iff $\pi \sim \pi'$ implies $\pi \triangleq \pi'$.


Note that if an LSTS is labelled consistently, then in particular all weakly
equivalent initial paths are also stutter equivalent. Hence, if an LSTS $TS$ is
labelled consistently and weak-trace equivalent to a subgraph $TS'$, then $TS$ and
$TS'$ are also stutter-trace equivalent.

Stubborn sets as defined in the previous section aim to preserve stutter-trace
equivalence between the original and the reduced LSTS. The motivation behind
this is that two stutter-trace equivalent LSTSs satisfy exactly the same
formulae in LTL_x. The following theorem, which is frequently cited in the
literature [9,10,18], aims to show that stubborn sets indeed preserve stutter-trace
equivalence. Its original formulation reasons about the validity of an arbitrary
LTL_x formula. Here, we give the alternative formulation based on stutter-trace
equivalence.


Theorem 1. [17, Theorem 2] Given an LSTS $TS$ and a weak/strong stubborn
set $r$, then the reduced LSTS $TS_r$ is stutter-trace equivalent to $TS$.


The original proof correctly concludes that the stubborn set method preserves
the order of visible actions in the reduced LSTS, i.e., $TS \sim TS_r$. However, this
only implies preservation of stutter-trace equivalence ($TS \triangleq TS_r$) if the full
LSTS is labelled consistently, so Theorem 1 is invalid in the general case. In the
next section, we will see a counter-example which exploits this fact.


3 Counter-Example 


Consider the LSTS in Figure 2, which we will refer to as $TS^C$. There is only
one atomic proposition $q$, which holds in the grey states and is false in the
other states. The initial state $\hat{s}$ is marked with an incoming arrow. First, note
that this LSTS is deterministic. The actions $a_1$, $a_2$ and $a_3$ are visible and $a$
and $a_{key}$ are invisible. By setting $r(\hat{s}) = \{a, a_{key}\}$, which is a weak stubborn
set, we obtain a reduced LSTS $TS^C_r$ that does not contain the dashed states
and transitions. The original LSTS contains the trace $\emptyset\{q\}\emptyset\emptyset\{q\}^\omega$, obtained by
following the path with actions $a_1 a_2 a a_3^\omega$. However, the reduced LSTS does not
contain a stutter equivalent trace. This is also witnessed by the LTL_x formula
$\Box(q \Rightarrow \Box(q \lor \Box\neg q))$, which holds for $TS^C_r$, but not for $TS^C$.


Fig. 2: Counter-example showing that stubborn sets do not preserve stutter- 
trace equivalence. Grey states are labelled with {q}. The dashed transitions and 
states are not present in the reduced LSTS. 


A very similar example can be used to show that strong stubborn sets suffer
from the same problem. Consider again the LSTS in Figure 2, but assume that
$a = a_{key}$, making the LSTS non-deterministic. Now, $r(\hat{s}) = \{a\}$ is a strong
stubborn set and again the trace $\emptyset\{q\}\emptyset\emptyset\{q\}^\omega$ is not preserved in the reduced
LSTS. In Section 4.3, we will see why the inconsistent labelling problem does
not occur for deterministic systems under strong stubborn sets.


The core of the problem lies in the fact that condition D1, even when combined
with V, does not enforce that the two paths it considers are stutter equivalent.
Consider the paths $s \xrightarrow{a}$ and $s \xrightarrow{a_1 a_2}$, and assume that $a \in r(s)$ and
$a_1 \notin r(s)$, $a_2 \notin r(s)$. Condition V ensures that at least one of the following two
holds: (i) $a$ is invisible, or (ii) $a_1$ and $a_2$ are invisible. Half of the possible
scenarios are depicted in Figure 3; the other half are symmetric. Again, the grey
states (and only those states) are labelled with $\{q\}$.


The two cases delimited with a solid line are problematic. In both LSTSs,
the paths $s \xrightarrow{a_1 a_2 a} s'$ and $s \xrightarrow{a a_1 a_2} s'$ are weakly equivalent, since $a$ is
invisible. However, they are not stutter equivalent, and therefore these LSTSs are
not labelled consistently. The topmost of these two LSTSs forms the core of
the counter-example $TS^C$, with the rest of $TS^C$ serving to satisfy condition
D2/D2w.
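To make the problematic situation concrete, the following worked example fixes one labelling that matches the description above. The intermediate state names $u$ and $u_1$, and the choice that only $s_1$ is labelled $\{q\}$, are assumptions made for illustration; they are not a transcription of Figure 3. Here $a$ is invisible and $a_1$, $a_2$ are visible.

```latex
\begin{align*}
\pi  &= s \xrightarrow{a_1} s_1 \xrightarrow{a_2} s_2 \xrightarrow{a} s',
  & \text{trace: } & \emptyset\,\{q\}\,\emptyset\,\emptyset,
  & \text{no-stutter trace: } & \emptyset\,\{q\}\,\emptyset, \\
\pi' &= s \xrightarrow{a} u \xrightarrow{a_1} u_1 \xrightarrow{a_2} s',
  & \text{trace: } & \emptyset\,\emptyset\,\emptyset\,\emptyset,
  & \text{no-stutter trace: } & \emptyset .
\end{align*}
```

Both paths project to $a_1 a_2$ on the visible actions, so $\pi \sim \pi'$, while their no-stutter traces differ, so $\pi \not\triangleq \pi'$. The labelling respects invisibility of $a$, since $L(s) = L(u)$ and $L(s_2) = L(s')$; a visible action is allowed, but not required, to change the label.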
Fig. 3: Nine possible scenarios when $a \in r(s)$ and $a_1 \notin r(s)$, $a_2 \notin r(s)$, according
to conditions D1 and V. The dotted and dashed lines indicate when $a$ or $a_1$, $a_2$
are invisible, respectively.


4 Strengthening Condition D1 


To fix the issue with inconsistent labelling, we propose to strengthen condition 
D1 as follows. 


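D1' For all $a \in r(s)$ and $a_1 \notin r(s), \dots, a_n \notin r(s)$, if $s \xrightarrow{a_1} s_1 \xrightarrow{a_2} \cdots \xrightarrow{a_n} s_n \xrightarrow{a} s'_n$,
then there are states $s', s'_1, \dots, s'_{n-1}$ such that $s \xrightarrow{a} s' \xrightarrow{a_1} s'_1 \xrightarrow{a_2} \cdots \xrightarrow{a_n} s'_n$.
Furthermore, if $a$ is invisible, then $s_i \xrightarrow{a} s'_i$ for every $1 \le i < n$.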


This new condition D1' provides a form of local consistent labelling when one
of $a_1, \dots, a_n$ is visible. In this case, V implies that $a$ is invisible and, consequently,
the presence of transitions $s_i \xrightarrow{a} s'_i$ implies $L(s_i) = L(s'_i)$. Hence, the problematic
cases of Figure 3 are resolved; a correctness proof is given below.

Condition D1' is very similar to condition C1 [5], which is common in the
context of ample sets. However, C1 requires that action $a$ is globally independent
of each of the actions $a_1, \dots, a_n$, while D1' merely requires a kind of local
independence. Persistent sets [7] also rely on a condition similar to D1', and
require local independence.


4.1 Implementation 


In practice, most, if not all, implementations of stubborn sets approximate D1
based on a binary relation $\leadsto_s$ on actions. This relation may (partly) depend on
the current state $s$ and it is defined such that D1 can be satisfied by ensuring
that if $a \in r(s)$ and $a \leadsto_s a'$, then also $a' \in r(s)$. A set satisfying D0, D1, D2,
D2w, V and/or I can be found by searching for a suitable strongly connected
component in the graph $(Act, \leadsto_s)$. Condition L is dealt with by other techniques.

Practical implementations construct $\leadsto_s$ by analysing how any two actions
$a$ and $a'$ interact. If $a$ is enabled, the simplest (but not necessarily the best
possible) strategy is to make $a \leadsto_s a'$ if and only if $a$ and $a'$ access at least
one variable in common. This can be relaxed, for instance, by not considering
commutative accesses, such as writing to and reading from a FIFO buffer. As a
result, $\leadsto_s$ can only detect reduction opportunities in (sub)graphs of the shape


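$$\begin{array}{ccccccc}
s & \xrightarrow{a_1} & s_1 & \cdots & s_{n-1} & \xrightarrow{a_n} & s_n \\
\downarrow{\scriptstyle a} & & \downarrow{\scriptstyle a} & & \downarrow{\scriptstyle a} & & \downarrow{\scriptstyle a} \\
s' & \xrightarrow{a_1} & s'_1 & \cdots & s'_{n-1} & \xrightarrow{a_n} & s'_n
\end{array}$$

where $a \in r(s)$ and $a_1 \notin r(s), \dots, a_n \notin r(s)$. The presence of the vertical $a$
transitions in $s_1, \dots, s_{n-1}$ implies that D1' is also satisfied by such implementations.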
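The simplest strategy described above can be sketched as a closure computation. This is a simplification under several assumptions: the relation is taken to be state-independent, the shared-variable rule is applied to every action regardless of whether it is enabled, a plain closure from a seed action replaces the search for a suitable strongly connected component, and conditions V, I and L are not handled. The names `depends`, `closure_from` and the `accesses` map are hypothetical.

```python
def depends(accesses, a, b):
    """a ~> b iff a and b access at least one variable in common."""
    return a != b and bool(accesses[a] & accesses[b])


def closure_from(seed, accesses):
    """Smallest set of actions that contains `seed` and is closed under ~>."""
    result, stack = {seed}, [seed]
    while stack:
        a = stack.pop()
        for b in accesses:
            if b not in result and depends(accesses, a, b):
                result.add(b)
                stack.append(b)
    return result


# Each action is mapped to the set of variables it reads or writes.
accesses = {
    "inc_x": {"x"},
    "read_x": {"x", "y"},
    "use_y": {"y"},
    "inc_z": {"z"},
}
```

With these accesses, `closure_from("inc_z", accesses)` contains only `inc_z`, whereas starting from `inc_x` pulls in `read_x` and, transitively, `use_y`. A smaller closure means fewer actions to explore in the current state.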


4.2 Correctness 


To show that D1' indeed resolves the inconsistent labelling problem, we reproduce
the construction in the original proof [17, Construction 1] in two lemmata
and show that it preserves stutter equivalence. Below, recall that $\to_r$ indicates
which transitions occur in the reduced state space.


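Lemma 1. Let $r$ be a weak stubborn set, where condition D1 is replaced by D1',
and $\pi = s_0 \xrightarrow{a_1} \cdots \xrightarrow{a_n} s_n \xrightarrow{a} s'_n$ a path such that $a_1 \notin r(s_0), \dots, a_n \notin r(s_0)$ and
$a \in r(s_0)$. Then, there is a path $\pi' = s_0 \xrightarrow{a}_r s'_0 \xrightarrow{a_1} \cdots \xrightarrow{a_n} s'_n$ such that $\pi \triangleq \pi'$.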


Proof. The existence of $\pi'$ follows directly from condition D1'. Due to condition
V and our assumption that $a_1 \notin r(s_0), \dots, a_n \notin r(s_0)$, it cannot be the case that
$a$ is visible and at least one of $a_1, \dots, a_n$ is visible. If $a$ is invisible, then the traces
of $s_0 \xrightarrow{a_1} \cdots \xrightarrow{a_n} s_n$ and $s'_0 \xrightarrow{a_1} \cdots \xrightarrow{a_n} s'_n$ are equivalent, since D1' implies that
$s_i \xrightarrow{a} s'_i$ for every $0 \le i \le n$, so $L(s_i) = L(s'_i)$. Otherwise, if all of $a_1, \dots, a_n$ are
invisible, then the sequences of labels observed along $\pi$ and $\pi'$ have the shape
$L(s_0)^{n+1} L(s'_n)$ and $L(s_0) L(s'_n)^{n+1}$, respectively. We conclude that $\pi \triangleq \pi'$.


Lemma 2. Let $r$ be a weak stubborn set, where condition D1 is replaced by D1',
and $\pi = s_0 \xrightarrow{a_1} s_1 \xrightarrow{a_2} \dots$ a path such that $a_i \notin r(s_0)$ for any $a_i$ that occurs in
$\pi$. Then, the following holds:


— If $\pi$ is of finite length $n > 0$, there exist an action $a_{key}$, a state $s'_n$ such that
$s_n \xrightarrow{a_{key}} s'_n$, and a path $\pi' = s_0 \xrightarrow{a_{key}}_r s'_0 \xrightarrow{a_1} \cdots \xrightarrow{a_n} s'_n$.

— If $\pi$ is infinite, there exists a path $\pi' = s_0 \xrightarrow{a_{key}}_r s'_0 \xrightarrow{a_1} s'_1 \xrightarrow{a_2} \dots$ for some
action $a_{key}$.


In either case, $\pi \triangleq \pi'$.
Proof. Let $K$ be the set of key actions in $s_0$. If $a_1$ is invisible, $K$ contains at least
one invisible action, due to I. Otherwise, if $a_1$ is visible, we reason that $K$ is not
empty (condition D2w) and all actions in $r(s_0)$, and thus also all actions in $K$,
are invisible, due to V. In the remainder, let $a_{key}$ be an invisible key action.

In case $\pi$ has finite length $n$, the existence of $s_n \xrightarrow{a_{key}} s'_n$ and $s_0 \xrightarrow{a_{key}}_r s'_0 \xrightarrow{a_1}
\cdots \xrightarrow{a_n} s'_n$ follows from the definition of key actions and D1', respectively.

If $\pi$ is infinite, we can apply the definition of key actions and D1' successively
to obtain a path $\pi_i = s_0 \xrightarrow{a_{key}}_r s^i_0 \xrightarrow{a_1} \cdots \xrightarrow{a_i} s^i_i$ for every $i \ge 0$, with $s_j \xrightarrow{a_{key}} s^i_j$ for
every $1 \le j \le i$. Since the reduced state space is finite, infinitely many of these
paths must use the same state as $s^i_0$. At most one of them ends at $s^i_0$ (the one
with $i = 0$), so infinitely many continue from $s^i_0$. Of them, infinitely many must
use the same $s^i_1$, again because the reduced state space is finite. Again, at most
one of them is lost because of ending at $s^i_1$. This reasoning can continue without
limit, proving the existence of $\pi' = s_0 \xrightarrow{a_{key}}_r s'_0 \xrightarrow{a_1} s'_1 \xrightarrow{a_2} \dots$, with $s_j \xrightarrow{a_{key}} s'_j$
for every $j \ge 0$.

Since $a_{key}$ is invisible, we have $L(s_j) = L(s'_j)$ for every $j \ge 0$. This implies
$\pi \triangleq \pi'$.


Lemmata 1 and 2 coincide with branches 1 and 2 of [17, Construction 1],
respectively, but contain the stronger result that $\pi \triangleq \pi'$. Thus, when applied
in the proof of [17, Theorem 2] (see also Theorem 1), this yields the result that
stubborn sets with condition D1' preserve stutter-trace equivalence.


Theorem 2. Given an LSTS $TS$ and weak/strong stubborn set $r$, where condition
D1 is replaced by D1', then the reduced LSTS $TS_r$ is stutter-trace equivalent
to $TS$.


We do not reproduce the complete proof, but provide insight into the appli- 
cation of the lemmata with the following example. 


Example 1. Consider the path obtained by following $a_1 a_2 a_3$ in Figure 4.
Lemmata 1 and 2 show that $a_1 a_2 a_3$ can always be mimicked in the reduced LSTS,
while preserving stutter equivalence. In this case, the path is mimicked by the
path corresponding to $a_{key} a_2 a_1 a'_{key} a_3$, drawn with dashes. The new path reorders
the actions $a_1$, $a_2$ and $a_3$ according to the construction of Lemma 1 and introduces
the key actions $a_{key}$ and $a'_{key}$ according to Lemma 2.


We remark that Lemma 2 also holds if the reduced LSTS is infinite, but
finitely branching.


4.3 Deterministic LSTSs 


As already noted in Section 3, strong stubborn sets for deterministic systems do
not suffer from the inconsistent labelling problem. The following lemma, which 
also appeared as Lemma 4.2], shows why. 


Lemma 3. For deterministic LSTSs, conditions D1 and D2 together imply 
D1’. 
Fig. 4: Example of how the trace $a_1, a_2, a_3$ can be mimicked by introducing
additional actions and moving $a_2$ to the front (dashed trace). Transitions that
are drawn in parallel have the same label.


5 Safe Logics 


In this section, we will identify two logics, viz. reachability and CTL_x, which
are not affected by the inconsistent labelling problem. This is either due to their
limited expressivity or the extra POR conditions that are required.


5.1 Reachability properties 


Although the counter-example of Section 3 shows that stutter-trace equivalence
is in general not preserved by stubborn sets, some fragments of LTL_x are
preserved. One such class of properties is reachability properties, which are of
the shape $\Box f$ or $\Diamond f$, where $f$ is a formula not containing temporal operators.


Theorem 3. Let $TS$ be an LSTS, $r$ a reduction function that satisfies either
D0, D1, D2, V and L or D1, D2w, V and L and $TS_r$ the reduced LSTS. For
all possible labellings $l \subseteq AP$, $TS$ contains a path to a state $s$ such that $L(s) = l$
iff $TS_r$ contains a path to a state $s'$ such that $L(s') = l$.


Proof. The ‘if’ case is trivial, since $TS_r$ is a subgraph of $TS$. For the ‘only if’ case,
we reason as follows. Let $TS = (S, \to, \hat{s}, L)$ be an LSTS and $\pi = s_0 \xrightarrow{a_1} \cdots \xrightarrow{a_n} s_n$
a path such that $s_0 = \hat{s}$. We mimic this path by repeatedly taking some enabled
action $a$ that is in the stubborn set, according to the following schema. Below, we
assume the path to be mimicked contains at least one visible action. Otherwise,
its first state would have the same labelling as $s_n$.


1. If there is an $i$ such that $a_i \in r(s_0)$, we consider the smallest such $i$, i.e.,
$a_1 \notin r(s_0), \dots, a_{i-1} \notin r(s_0)$. Then, we can shift $a_i$ forward by D1, move
towards $s_n$ along $s_0 \xrightarrow{a_i} s'_0$ and continue by mimicking
$s'_0 \xrightarrow{a_1} \cdots \xrightarrow{a_{i-1}} s_i \xrightarrow{a_{i+1}} \cdots \xrightarrow{a_n} s_n$.


2. If all of $a_1 \notin r(s_0), \dots, a_n \notin r(s_0)$, then, by D0 and D2 or by D2w, there
is a key action $a_{key}$ in $s_0$. By the definition of key actions and D1, $a_{key}$ leads
to a state $s'_0$ from which we can continue mimicking the path $s'_0 \xrightarrow{a_1} s'_1 \xrightarrow{a_2}
\cdots \xrightarrow{a_n} s'_n$. Note that $L(s_n) = L(s'_n)$, since $a_{key}$ is invisible by condition V.

The second case cannot be repeated infinitely often, due to condition L. Hence,
after a finite number of steps, we reach a state $s'_n$ with $L(s'_n) = L(s_n)$.

We remark that more efficient mechanisms for reachability checking under
POR have been proposed, such as condition S [21], which can replace L, or
conditions based on up-sets [13]. Another observation is that model checking
of LTL_x properties can be reduced to reachability checking by computing the
cross-product of a Büchi automaton and an LSTS [1], in the process resolving
the inconsistent labelling problem. Peled shows how this approach can be
combined with POR, but please see [14].


5.2 Deterministic LSTSs and CTL_x Model Checking 


In this section, we will consider the inconsistent labelling problem in the set- 
ting of CTL_x model checking. When applying stubborn sets in that context, 
stronger conditions are required to preserve the branching structure that CTL_x 
reasons about. Namely, the original LSTS must be deterministic and one more 
condition needs to be added [5]: 


C4 Either $r(s) = Act$ or $r(s) \cap enabled(s) = \{a\}$ for some $a \in Act$.


We slightly changed its original formulation to match the setting of stubborn 
sets. A weaker condition, called A8, which does not require determinism of 
the whole LSTS is proposed in [19]. With C4, strong and weak stubborn sets 
collapse, as shown by the following lemma. 


Lemma 4. Conditions D2w and C4 together imply D0 and D2.


Proof. Let $TS$ be an LSTS, $s$ a state and $r$ a reduction function that satisfies
D2w and C4. Condition D0 is trivially implied by C4. Using C4, we distinguish
two cases: either $r(s)$ contains precisely one enabled action $a$, or $r(s) = Act$. In
the former case, this single action $a$ must be a key action, according to D2w.
Hence, D2, which requires that all enabled actions in $r(s)$ are key actions, is
satisfied. Otherwise, if $r(s) = Act$, we consider an arbitrary action $a$ that
satisfies D2's precondition that $s \xrightarrow{a}$. Given a path $s \xrightarrow{a_1 \dots a_n} s'$, the condition that
$a_1 \notin r(s), \dots, a_n \notin r(s)$ only holds if $n = 0$. We conclude that D2's condition
$s' \xrightarrow{a}$ is satisfied by the assumption $s \xrightarrow{a}$.


It follows from Lemmata 3 and 4 and Theorem 2 that CTL_x model checking
of deterministic systems with stubborn sets does not suffer from the inconsistent
labelling problem. The same holds for condition A8, as already shown in [19].


6 Petri Nets 


Petri nets are a widely-known formalism for modelling concurrent processes and
have seen frequent use in the application of stubborn-set theory [4,10,21,22].
A Petri net contains a set of places $P$ and a set of structural transitions $T$.
Arcs between places and structural transitions are weighted according to a total
function $W : (P \times T) \cup (T \times P) \to \mathbb{N}$. The state space of the underlying LSTS is the
set $\mathcal{M}$ of all markings; a marking $m$ is a function $P \to \mathbb{N}$, which assigns a number
of tokens to each place. The LSTS contains a transition $m \xrightarrow{t} m'$ iff $m(p) \ge W(p,t)$
and $m'(p) = m(p) - W(p,t) + W(t,p)$ for all places $p \in P$. As before, we
assume the LSTS contains some labelling function $L : \mathcal{M} \to 2^{AP}$. More details
on the labels are given below. Note that markings and structural transitions take
over the role of states and actions respectively. The set of markings reachable
under $\to$ from some initial marking $\hat{m}$ is denoted $\mathcal{M}_{reach}$.
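The firing rule translates directly into code. The sketch below assumes a marking is a dictionary from place names to token counts and that the weight function is a dictionary in which missing arcs have weight 0; the function names and the example net are illustrative only.

```python
def is_enabled(marking, t, weights):
    """t is enabled in m iff m(p) >= W(p, t) for every place p."""
    return all(marking[p] >= weights.get((p, t), 0) for p in marking)


def fire(marking, t, weights):
    """Return m' with m'(p) = m(p) - W(p, t) + W(t, p) for every place p."""
    if not is_enabled(marking, t, weights):
        raise ValueError(f"transition {t!r} is not enabled")
    return {
        p: marking[p] - weights.get((p, t), 0) + weights.get((t, p), 0)
        for p in marking
    }


# A net with two places in which t moves one token from p1 to p2.
weights = {("p1", "t"): 1, ("t", "p2"): 1}
m0 = {"p1": 1, "p2": 0}
m1 = fire(m0, "t", weights)
```

Representing absent arcs by weight 0 keeps $W$ total, as in the definition, without listing every pair explicitly.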


Example 2. Consider the Petri net with initial marking $\hat{m}$ below on the left.
Here, all arcs are weighted 1, except for the arc from $p_5$ to $t_2$, which is weighted
2. Its LSTS is infinite, but the reachable substructure is depicted on the right.
The number of tokens in each of the places $p_1, \dots, p_6$ is inscribed in the nodes,
the state labels (if any) are written beside the nodes.


[Figure: the Petri net (left) and the reachable part of its LSTS (right).]


The LSTS practically coincides with the counter-example of Section 3. Only the
self-loops are missing and the state labelling, with atomic propositions $q$, $q_p$ and
$q_l$, differs slightly; the latter will be explained later. For now, note that $t$ and $t_{key}$
are invisible and that the trace $\emptyset\{q\}\emptyset\emptyset\{q\}$, which occurs when firing transitions
$t_1 t_2 t t_3$ from $\hat{m}$, can be lost when reducing with weak stubborn sets.


In the remainder of this section, we fix a Petri net $(P, T, W, \hat{m})$ and its LSTS
$(\mathcal{M}, \to, \hat{m}, L)$. Below, we consider three different types of atomic propositions.
Firstly, polynomial propositions [4] are of the shape $f(p_1, \dots, p_n) \bowtie k$ where $f$ is
a polynomial over $p_1, \dots, p_n$, ${\bowtie} \in \{<, \le, >, \ge, =, \neq\}$ and $k \in \mathbb{Z}$. Such a
proposition holds in a marking $m$ iff $f(m(p_1), \dots, m(p_n)) \bowtie k$. A linear proposition [10]
is similar, but the function $f$ over places must be linear and $f(0, \dots, 0) = 0$, i.e.,
linear propositions are of the shape $k_1 p_1 + \cdots + k_n p_n \bowtie k$, where $k_1, \dots, k_n, k \in \mathbb{Z}$.
Finally, we have arbitrary propositions [22], whose shape is not restricted and
which can hold in any given set of markings.

Several other types of atomic propositions can be encoded as polynomial
propositions. For example, fireable($t$) [4,10], which holds in a marking $m$ iff $t$ is
enabled in $m$, can be encoded as $\prod_{p \in P} \prod_{i=0}^{W(p,t)-1} (p - i) \ge 1$. The proposition
deadlock, which holds in markings where no structural transition is enabled, does
not require special treatment in the context of POR, since it is already preserved
by D1 and D2w. The sets containing all linear and polynomial propositions
are henceforward called $AP_l$ and $AP_p$, respectively. The corresponding labelling
functions are defined as $L_l(m) = L(m) \cap AP_l$ and $L_p(m) = L(m) \cap AP_p$ for all
markings $m$. Below, the two stutter equivalences $\triangleq_{L_l}$ and $\triangleq_{L_p}$ that follow from
the new labelling functions are abbreviated $\triangleq_l$ and $\triangleq_p$, respectively. Note that
$AP \supseteq AP_p \supseteq AP_l$ and ${\triangleq} \subseteq {\triangleq_p} \subseteq {\triangleq_l}$.

For the purpose of introducing several variants of invisibility, we reformulate
and generalise the definition of invisibility from Section 2. Given an atomic
proposition $q \in AP$, a relation $R \subseteq \mathcal{M} \times \mathcal{M}$ is $q$-invisible if and only if
$(m, m') \in R$ implies $q \in L(m) \Leftrightarrow q \in L(m')$. We consider a structural transition
$t$ $q$-invisible iff its corresponding relation $\{(m, m') \mid m \xrightarrow{t} m'\}$ is $q$-invisible.
Invisibility is also lifted to sets of atomic propositions: given a set $AP' \subseteq AP$,
relation $R$ is $AP'$-invisible iff it is $q$-invisible for all $q \in AP'$. If $R$ is $AP$-invisible,
we plainly say that $R$ is invisible. $AP'$-invisibility and invisibility carry over to
structural transitions. We sometimes refer to invisibility as ordinary invisibility
for emphasis. Note that the set of invisible structural transitions $\mathcal{I}$ is no longer
an under-approximation, but contains exactly those structural transitions $t$ for
which $m \xrightarrow{t} m'$ implies $L(m) = L(m')$ (cf. Section 2).

We are now ready to introduce three orthogonal variations on invisibility.
Firstly, relation $R \subseteq \mathcal{M} \times \mathcal{M}$ is reach $q$-invisible [21] iff
$R \cap (\mathcal{M}_{reach} \times \mathcal{M}_{reach})$ is $q$-invisible, i.e., all the pairs of reachable
markings $(m, m') \in R$ agree on the labelling of $q$. Secondly, $R$ is value
$q$-invisible if (i) $q$ is polynomial and for all $(m, m') \in R$,
$f(m(p_1), \dots, m(p_n)) = f(m'(p_1), \dots, m'(p_n))$; or if (ii) $q$ is not polynomial
and $R$ is $q$-invisible. Intuitively, this means that the value of polynomial $f$
never changes between two markings $(m, m') \in R$. Reach and value invisibility
are lifted to structural transitions and sets of atomic propositions as before,
i.e., by taking $R = \{(m, m') \mid m \xrightarrow{t} m'\}$ when considering invisibility of $t$.
Finally, we introduce another way to lift invisibility to structural transitions:
$t$ is strongly $q$-invisible iff the set
$\{(m, m') \mid \forall p \in P : m'(p) = m(p) + W(t, p) - W(p, t)\}$ is $q$-invisible.
Strong invisibility does not take the presence of a transition $m \xrightarrow{t} m'$ into
account, and purely reasons about the effects of $t$. Value invisibility and strong
invisibility are new in the current work, although strong invisibility was inspired
by the notion of invisibility that is proposed by Varpaaniemi in [22].

Fig. 5: Lattice of sets of invisible actions. Arrows represent a subset relation.

We indicate the sets of all value, reach and strongly invisible structural
transitions with $\mathcal{I}_v$, $\mathcal{I}^r$ and $\mathcal{I}_s$, respectively. Since $\mathcal{I}_v \subseteq \mathcal{I}$, $\mathcal{I}_s \subseteq \mathcal{I}$ and $\mathcal{I} \subseteq \mathcal{I}^r$, the
set of all their possible combinations forms the lattice shown in Figure 5. In
the remainder, the weak equivalence relations that follow from each of the eight
invisibility notions are abbreviated, e.g., $\sim_{\mathcal{I}^r_{sv}}$ becomes $\sim^r_{sv}$.
Fig. 6: Two lattices containing variations of weak equivalence and stutter
equivalence, respectively. Solid arrows indicate a subset relation inside the lattice;
dotted arrows follow from the indicated theorems and show when the LSTS of
a Petri net is labelled consistently.


Example 3. Consider again the Petri net and LSTS from Example 2. We can
define $q_l$ and $q_p$ as linear and polynomial propositions, respectively:


— $q_l := p_3 + p_4 + p_6 = 0$ is a linear proposition, which holds when neither
$p_3$, $p_4$ nor $p_6$ contains a token. Structural transition $t$ is $q_l$-invisible,
because $m \xrightarrow{t} m'$ implies that $m(p_3) = m'(p_3) \geq 1$, and thus neither
$m$ nor $m'$ is labelled with $q_l$. On the other hand, $t$ is not value
$q_l$-invisible (by the transition $101100 \xrightarrow{t} 101010$) or strongly reach
$q_l$-invisible (by 010100 and 010010). However, $t_{key}$ is strongly value
$q_l$-invisible: it moves a token from $p_4$ to $p_6$ and hence never changes the
value of $p_3 + p_4 + p_6$.

— $q_p := (1 - p_3)(1 - p_5) = 1$ is a polynomial proposition, which holds in all
reachable markings $m$ where $m(p_3) = 0$ and $m(p_5) = 0$. Structural transition
$t$ is reach value $q_p$-invisible, but not $q_p$-invisible (by
$002120 \xrightarrow{t} 002030$) or strongly reach $q_p$-invisible. Strong value
$q_p$-invisibility of $t_{key}$ follows immediately from the fact that the adjacent
places of $t_{key}$, viz. $p_4$ and $p_6$, do not occur in the definition of $q_p$.


This yields the state labelling which is shown in Example 
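
The claims in Example 3 can be checked mechanically on the markings it lists. The following is a minimal sketch, assuming that a marking such as 101100 lists the token counts of $p_1$ to $p_6$ from left to right; the helper names (`parse`, `f_l`, `q_l`, `f_p`, `q_p`, `report`) are hypothetical and not part of the original development.

```python
from typing import Tuple

Marking = Tuple[int, ...]  # token counts for places p1..p6, in that order (assumed)


def parse(s: str) -> Marking:
    """Turn a marking written as a digit string, e.g. '101100', into a tuple."""
    return tuple(int(ch) for ch in s)


def f_l(m: Marking) -> int:
    """Left-hand side of q_l: p3 + p4 + p6."""
    return m[2] + m[3] + m[5]


def q_l(m: Marking) -> bool:
    """q_l holds when p3 + p4 + p6 = 0."""
    return f_l(m) == 0


def f_p(m: Marking) -> int:
    """Left-hand side of q_p: (1 - p3)(1 - p5)."""
    return (1 - m[2]) * (1 - m[4])


def q_p(m: Marking) -> bool:
    """q_p holds when (1 - p3)(1 - p5) = 1."""
    return f_p(m) == 1


def report(src: str, dst: str) -> dict:
    """Compare labels and polynomial values across one pair of markings."""
    a, b = parse(src), parse(dst)
    return {
        "q_l label kept": q_l(a) == q_l(b),
        "f_l value kept": f_l(a) == f_l(b),
        "q_p label kept": q_p(a) == q_p(b),
        "f_p value kept": f_p(a) == f_p(b),
    }


if __name__ == "__main__":
    for pair in [("101100", "101010"), ("010100", "010010"), ("002120", "002030")]:
        print(pair, report(*pair))
```

On the pair 101100 and 101010 the label of $q_l$ is kept while the value of $p_3 + p_4 + p_6$ changes, which is the difference between ordinary and value invisibility that the example relies on.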


Given a weak equivalence relation $R_\sim$ and a stutter equivalence relation
$R_\triangleq$, we write $R_\sim \preceq R_\triangleq$ to indicate that $R_\sim$ and
$R_\triangleq$ yield consistent labelling. We spend the rest of this section
investigating under which notions of invisibility and propositions from the
literature the LSTS of a Petri net is labelled consistently. More formally, we
check for each weak equivalence relation $R_\sim$ and each stutter equivalence
relation $R_\triangleq$ whether $R_\sim \preceq R_\triangleq$. This tells us when existing
stubborn set theory can be applied without problems. The two lattices containing
all weak and stuttering equivalence relations are depicted in Figure 6; each
dotted arrow represents a consistent labelling result. Before we continue, we
first introduce an auxiliary lemma.


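Lemma 5. Let $I$ be a set of invisible structural transitions and $L$ some
labelling function. If for all $t \in I$ and paths
$\pi = m_0 \xrightarrow{t_1} m_1 \xrightarrow{t_2} \dots$ and
$\pi' = m_0 \xrightarrow{t} m'_0 \xrightarrow{t_1} m'_1 \xrightarrow{t_2} \dots$,
it holds that $\pi \triangleq_L \pi'$, then $\sim_I \preceq \triangleq_L$.

Proof. We assume that the following holds for all paths and $t \in I$:

$m_0 \xrightarrow{t_1} m_1 \xrightarrow{t_2} \dots \;\triangleq_L\; m_0 \xrightarrow{t} m'_0 \xrightarrow{t_1} m'_1 \xrightarrow{t_2} \dots$    (†)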


We consider two initial paths $\pi$ and $\pi'$ such that $\pi \sim_I \pi'$ and prove
that $\pi \triangleq_L \pi'$. The proof proceeds by induction on the combined number
of invisible structural transitions (taken from $I$) in $\pi$ and $\pi'$. In the
base case, $\pi$ and $\pi'$ contain only visible structural transitions, and
$\pi \sim_I \pi'$ implies $\pi = \pi'$ since Petri nets are deterministic. Hence,
$\pi \triangleq_L \pi'$.

For the induction step, we take as hypothesis that, for all initial paths $\pi$
and $\pi'$ that together contain at most $k$ invisible structural transitions,
$\pi \sim_I \pi'$ implies $\pi \triangleq_L \pi'$. Let $\pi$ and $\pi'$ be two arbitrary
initial paths such that $\pi \sim_I \pi'$ and the total number of invisible
structural transitions contained in $\pi$ and $\pi'$ is $k$. We consider the case
where an invisible structural transition is introduced in $\pi'$; the other case
is symmetric. Let $\pi' = \sigma_1 \sigma_2$ for some $\sigma_1$ and $\sigma_2$. Let
$t \in I$ be some invisible structural transition and $\pi'' = \sigma_1 t \sigma'_2$
such that $\sigma_2$ and $\sigma'_2$ contain the same sequence of structural
transitions. Clearly, we have $\pi' \sim_I \pi''$. Here, we can apply our original
assumption (†) to conclude that $\sigma_2 \triangleq_L t \sigma'_2$, i.e., the extra
stuttering step $t$ does not affect the labelling of the remainder of $\pi''$.
Hence, we have $\pi' \triangleq_L \pi''$ and, with the induction hypothesis,
$\pi \triangleq_L \pi''$. Note that $\pi$ and $\pi''$ together contain $k + 1$ invisible
structural transitions.

In case $\pi$ and $\pi'$ together contain an infinite number of invisible
structural transitions, the fact that $\pi \sim_I \pi'$ implies
$\pi \triangleq_L \pi'$ follows from the fact that the same holds for all finite
prefixes of $\pi$ and $\pi'$ that are related by $\sim_I$.


The following theorems each focus on a class of atomic propositions and
show which notion of invisibility is required for the LSTS of a Petri net to be
labelled consistently. In the proofs, we use a function $d_t$, defined as
$d_t(p) = W(t, p) - W(p, t)$ for all places $p$, which indicates how structural
transition $t$ changes the state. Furthermore, we also consider functions of type
$P \to \mathbb{N}$ as vectors of type $\mathbb{N}^{|P|}$. This allows us to compute the
pairwise addition of a marking $m$ with $d_t$ ($m + d_t$) and to indicate that $t$
does not change the marking ($d_t = 0$).
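
The effect vector $d_t$ and the pairwise addition $m + d_t$ can be sketched as follows. This is an illustration only: the weight function `W` below is an assumed fragment (a transition `t` moving a token from `p4` to `p5` and a transition `t_key` moving a token from `p4` to `p6`, as suggested by Example 3), and the names `effect`, `add` and `linear` are hypothetical.

```python
from typing import Dict, Tuple

Arc = Tuple[str, str]  # (source, target); one end is a place, the other a transition
PLACES = ("p1", "p2", "p3", "p4", "p5", "p6")

# Assumed arc weights; arcs that are absent have weight 0.
W: Dict[Arc, int] = {
    ("p4", "t"): 1, ("t", "p5"): 1,
    ("p4", "t_key"): 1, ("t_key", "p6"): 1,
}


def effect(t: str, weights: Dict[Arc, int] = W) -> Tuple[int, ...]:
    """d_t(p) = W(t, p) - W(p, t) for every place p."""
    return tuple(weights.get((t, p), 0) - weights.get((p, t), 0) for p in PLACES)


def add(m: Tuple[int, ...], d: Tuple[int, ...]) -> Tuple[int, ...]:
    """Pairwise addition m + d of a marking and an effect vector."""
    return tuple(a + b for a, b in zip(m, d))


def linear(coeffs: Tuple[int, ...], v: Tuple[int, ...]) -> int:
    """A linear function f(v) = sum_i coeffs[i] * v[i]."""
    return sum(c * x for c, x in zip(coeffs, v))


F_L = (0, 0, 1, 1, 0, 1)  # coefficients of p3 + p4 + p6

if __name__ == "__main__":
    m = (1, 0, 1, 1, 0, 0)
    d_t = effect("t")
    print(d_t, add(m, d_t))
    # For a linear f, f(m + d_t) = f(m) + f(d_t), so f is unchanged iff f(d_t) = 0.
    print(linear(F_L, add(m, d_t)), linear(F_L, m) + linear(F_L, d_t))
    print(linear(F_L, effect("t_key")))
```

For linear $f$ the identity $f(m + d_t) = f(m) + f(d_t)$ reduces value invisibility of $t$ to the single check $f(d_t) = 0$, which is the step used in the proof of Theorem 4.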


Theorem 4. Under reach value invisibility, the LSTS underlying a Petri net is
labelled consistently for linear propositions, i.e., $\sim^r_v \preceq \triangleq_l$.


Proof. Let $t \in \mathcal{I}^r_v$ be a reach value invisible structural transition
such that there exist reachable markings $m$ and $m'$ with $m \xrightarrow{t} m'$. If
such a $t$ does not exist, then $\sim^r_v$ is the reflexive relation and
$\sim^r_v \preceq \triangleq_l$ is trivially satisfied. Otherwise, let
$q := f(p_1, \dots, p_n) \bowtie k$ be a linear proposition. Since $t$ is reach value
invisible and $f$ is linear, we have
$f(m) = f(m') = f(m + d_t) = f(m) + f(d_t)$ and thus $f(d_t) = 0$. It follows that,
given two paths $\pi = m_0 \xrightarrow{t_1} m_1 \xrightarrow{t_2} \dots$ and
$\pi' = m_0 \xrightarrow{t} m'_0 \xrightarrow{t_1} m'_1 \xrightarrow{t_2} \dots$, the
addition of $t$ does not influence $f$, since
$f(m_i) = f(m_i) + f(d_t) = f(m_i + d_t) = f(m'_i)$ for all $i$. As a consequence,
$t$ also does not influence $q$. With Lemma 5, we deduce that
$\sim^r_v \preceq \triangleq_l$.


Whereas in the linear case one can easily conclude that $\pi$ and $\pi'$ are
stutter equivalent under $f$, in the polynomial case we need to show that $f$ is
constant under all value invisible structural transitions $t$, even in markings
where $t$ is not enabled. This follows from the following proposition.


Proposition 1. Let $f : \mathbb{N}^n \to \mathbb{Z}$ be a polynomial function,
$a, b \in \mathbb{N}^n$ two constant vectors and $c = a - b$ the difference between
$a$ and $b$. Assume that for all $x \in \mathbb{N}^n$ such that $x \geq b$, where
$\geq$ denotes pointwise comparison, it holds that $f(x) = f(x + c)$. Then, $f$ is
constant in the vector $c$, i.e., $f(x) = f(x + c)$ for all $x \in \mathbb{N}^n$.


Proof. Let $f$, $a$, $b$ and $c$ be as above and let $\mathbf{1} \in \mathbb{N}^n$ be the
vector containing only ones. Given some arbitrary $x \in \mathbb{N}^n$, consider the
function $g_x(t) = f(x + t \cdot \mathbf{1} + c) - f(x + t \cdot \mathbf{1})$. For
sufficiently large $t$, it holds that $x + t \cdot \mathbf{1} \geq b$, and it follows
that $g_x(t) = 0$ for all sufficiently large $t$. Since $g_x$ is a polynomial in
$t$, this can only be the case if $g_x$ is the zero polynomial, i.e., $g_x(t) = 0$
for all $t$. As a special case, we conclude that $g_x(0) = f(x + c) - f(x) = 0$.


The intuition behind this is that f(x +c) — f(x) behaves like the directional 
derivative of f with respect to c. If the derivative is equal to zero in infinitely 
many x, f must be constant in the direction of c. We will apply this result in 
the following theorem. 
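
A small numeric illustration of Proposition 1 is given below. It is only a sketch on a finite grid, not a proof: the polynomials `f` and `h`, the vectors `A` and `B`, and the helper names are all hypothetical choices made for this example. Here `f` depends only on $x_1 - x_2$ and is therefore constant in the direction $c = (1, 1)$, whereas `h` is not.

```python
from itertools import product
from typing import Callable, Tuple

Vec = Tuple[int, int]


def f(x: Vec) -> int:
    """A polynomial that only depends on x1 - x2."""
    x1, x2 = x
    return (x1 - x2) ** 2 + 3 * (x1 - x2)


def h(x: Vec) -> int:
    """A polynomial that is not constant in the direction (1, 1)."""
    x1, x2 = x
    return x1 * x2


A: Vec = (3, 3)
B: Vec = (2, 2)
C: Vec = (A[0] - B[0], A[1] - B[1])  # c = a - b = (1, 1)


def shift(x: Vec, c: Vec) -> Vec:
    return (x[0] + c[0], x[1] + c[1])


def g(func: Callable[[Vec], int], x: Vec, t: int) -> int:
    """g_x(t) = func(x + t*1 + c) - func(x + t*1), as in the proof."""
    base = (x[0] + t, x[1] + t)
    return func(shift(base, C)) - func(base)


def invariant_on_grid(func: Callable[[Vec], int], bound: int) -> bool:
    """Check func(x) == func(x + c) for all x with entries below `bound`."""
    return all(func(shift(x, C)) == func(x)
               for x in product(range(bound), repeat=2))


if __name__ == "__main__":
    print(invariant_on_grid(f, 8), invariant_on_grid(h, 8))
    print([g(f, (0, 5), t) for t in range(5)], [g(h, (0, 0), t) for t in range(5)])
```

For `f`, the function $g_x$ vanishes for every $t$, including $t = 0$ where $x \geq b$ fails; for `h`, $g_x$ is a non-zero polynomial in $t$, so the premise of the proposition cannot hold.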


Theorem 5. Under value invisibility, the LSTS underlying a Petri net is
labelled consistently for polynomial propositions, i.e., $\sim_v \preceq \triangleq_p$.


Proof. Let $t \in \mathcal{I}_v$ be a value invisible structural transition, $m$ and
$m'$ two markings with $m \xrightarrow{t} m'$, and
$q := f(p_1, \dots, p_n) \bowtie k$ a polynomial proposition. Note that infinitely
many such (not necessarily reachable) markings exist in $M$, so we can apply
Proposition 1 to obtain $f(m) = f(m + d_t)$ for all markings $m$. It follows that,
given two paths $\pi = m_0 \xrightarrow{t_1} m_1 \xrightarrow{t_2} \dots$ and
$\pi' = m_0 \xrightarrow{t} m'_0 \xrightarrow{t_1} m'_1 \xrightarrow{t_2} \dots$, the
addition of $t$ does not alter the value of $f$, since
$f(m_i) = f(m_i + d_t) = f(m'_i)$ for all $i$. As a consequence, $t$ also does not
change the labelling of $q$. Application of Lemma 5 yields
$\sim_v \preceq \triangleq_p$.


Varpaaniemi shows that the LSTS of a Petri net is labelled consistently 
for arbitrary propositions under his notion of invisibility Lemma 9]. Our 
notion of strong invisibility, and especially strong reach invisibility, is weaker
than Varpaaniemi's invisibility, so we generalise the result to
$\sim^r_s \preceq \triangleq$.


Theorem 6. Under strong reach invisibility, the LSTS underlying a Petri net is
labelled consistently for arbitrary propositions, i.e., $\sim^r_s \preceq \triangleq$.


Proof. Let $t \in \mathcal{I}^r_s$ be a strongly reach invisible structural
transition and $\pi = m_0 \xrightarrow{t_1} m_1 \xrightarrow{t_2} \dots$ and
$\pi' = m_0 \xrightarrow{t} m'_0 \xrightarrow{t_1} m'_1 \xrightarrow{t_2} \dots$ two
paths. Since $m'_i = m_i + d_t$ for all $i$, it holds that either (i) $d_t = 0$ and
$m_i = m'_i$ for all $i$; or (ii) each pair $(m_i, m'_i)$ is contained in
$\{(m, m') \mid \forall p \in P : m'(p) = m(p) + W(t, p) - W(p, t)\}$, which is the
set that underlies strong reach invisibility of $t$. In both cases,
$L(m_i) = L(m'_i)$ for all $i$. It follows from Lemma 5 that
$\sim^r_s \preceq \triangleq$.

To show that the results of the above theorems cannot be strengthened, we
provide two negative results.


Theorem 7. Under ordinary invisibility, the LSTS underlying a Petri net is
not necessarily labelled consistently for arbitrary propositions, i.e.,
$\sim \not\preceq \triangleq$.


Proof. Consider the Petri net from Example 2 with the arbitrary proposition
$q_l$. Disregard $q_p$ for the moment. Structural transition $t$ is $q_l$-invisible,
hence the paths corresponding to $t_1 t_2 t t_3$ and $t t_1 t_2 t_3$ are weakly
equivalent under ordinary invisibility. However, they are not stutter equivalent.


Theorem 8. Under reach value invisibility, the LSTS underlying a Petri net is
not necessarily labelled consistently for polynomial propositions, i.e.,
$\sim^r_v \not\preceq \triangleq_p$.

Proof. Consider the Petri net from Example 2 with the polynomial proposition
$q_p := (1 - p_3)(1 - p_5) = 1$ from Example 3. Disregard $q_l$ in this reasoning.
Structural transition $t$ is reach value $q_p$-invisible, hence the paths
corresponding to $t_1 t_2 t t_3$ and $t t_1 t_2 t_3$ are weakly equivalent under
reach value invisibility. However, they are not stutter equivalent for polynomial
propositions.


It follows from Theorems 7 and 8 and transitivity of $\subseteq$ that Theorems 4
and 6 cannot be strengthened further. In terms of Figure 6, this means that the
dotted arrows cannot be moved downward in the lattice of weak equivalences and
cannot be moved upward in the lattice of stutter equivalences. The implications
of these findings on related work will be discussed in the next section.


7 Related Work 


There are many works in the literature that apply stubborn sets. We will consider
several works that aim to preserve LTL$_{-X}$ and discuss whether they are correct
when it comes to the problem presented in the current work.

Liebke and Wolf present an approach for efficient CTL model checking on Petri
nets. For some formulas, they can reduce CTL model checking to LTL model
checking, which allows greater reductions under POR. They rely on the incorrect
LTL preservation theorem, and since they apply the techniques on Petri nets with
ordinary invisibility, their theory is incorrect (Theorem 7). Similarly, the
overview of stubborn set theory presented by Valmari and Hansen applies reach
invisibility and does not necessarily preserve LTL$_{-X}$. Varpaaniemi also applies
stubborn sets to Petri nets, but relies on a visibility notion that is stronger
than strong invisibility. The correctness of these results is thus not affected
(Theorem 6). The approach of Bønneland et al. [4] operates on two-player Petri
nets, but only aims to preserve reachability and consequently does not suffer
from the inconsistent labelling problem.

A generic implementation of weak stubborn sets is proposed by Laarman
et al. [9]. They use abstract concepts such as guards and transition groups to
implement POR in a way that is agnostic of the input language. The theory
they present includes condition D1, which is too weak, but the accompanying
implementation follows the framework of Section and thus it is correct by
Theorem 2. The implementations proposed in [21,23] are similar, albeit specific
to Petri nets.

Others perform action-based model checking and thus strive to preserve
weak trace equivalence or inclusion. As such, they do not suffer from the problems
discussed here, which apply only to state labels.

[Figure: counter-example transition system; the surviving labels are {q} and a.]

Although Beneš et al. [2,3] rely on ample sets, and not on stubborn sets,
they also discuss weak trace equivalence and stutter-trace equivalence. In fact,
they present an equivalence relation for traces that is a combination of weak
and stutter equivalence. The paper includes a theorem that weak equivalence
implies their new state/event equivalence [2, Theorem 6.5]. However, the
counter-example on the right shows that this consistent labelling theorem does
not hold. Here, the action $\tau$ is invisible, and the two paths in this
transition system are thus weakly equivalent. However, they are not stutter
equivalent, which is a special case of state/event equivalence. Although the main
POR correctness result [2, Corollary 6.6] builds on the incorrect consistent
labelling theorem, its correctness
does not appear to be affected. An alternative proof can be constructed based
on Lemmas 1 and 3.

The current work is not the first to point out mistakes in POR theory. In [14],
Siegel presents a flaw in an algorithm that combines POR and on-the-fly model
checking [12]. In that setting, POR is applied on the product of an LSTS and a
Büchi automaton. Let $q$ be a state of the LSTS and $s$ a state of the Büchi
automaton. While investigating a transition $(q, s) \xrightarrow{a} (q', s')$,
condition C3, which, like condition L, aims to solve the action ignoring problem,
incorrectly sets $r(q, s') = \mathit{enabled}(q)$ instead of
$r(q, s) = \mathit{enabled}(q)$.


8 Conclusion 


We discussed the inconsistent labelling problem for preservation of stutter-trace 
equivalence with stubborn sets. The issue is relatively easy to repair by strength- 
ening condition D1. For Petri nets, altering the definition of invisibility can also 
resolve inconsistent labelling depending on the type of atomic propositions. The 
impact on applications presented in related works seems to be limited: the prob- 
lem is typically mitigated in the implementation, since it is very hard to compute 
D1 exactly. This is also a possible explanation for why the inconsistent labelling 
problem has not been noticed for so many years. 

Since this is not the first error found in POR theory [14], a more rigorous
approach to proving its correctness, e.g. using proof assistants, would provide
more confidence.


Abstract. We describe a category-theoretic semantics for a simply typed
variant of Cocon, a contextual modal type theory where the box modal- 
ity mediates between the weak function space that is used to represent 
higher-order abstract syntax (HOAS) trees and the strong function space 
that describes (recursive) computations about them. What makes Co- 
con different from standard type theories is the presence of first-class 
contexts and contextual objects to describe syntax trees that are closed 
with respect to a given context of assumptions. Following M. Hofmann’s 
work, we use a presheaf model to characterise HOAS trees. Surprisingly, 
this model already provides the necessary structure to also model Cocon. 
In particular, we can capture the contextual objects of Cocon using a 
comonad $\flat$ that restricts presheaves to their closed elements. This gives
a simple semantic characterisation of the invariants of contextual types 
(e.g. substitution invariance) and identifies Cocon as a type-theoretic syn- 
tax of presheaf models. We express our category-theoretic constructions 
by using a modal internal type theory that is implemented in Agda-Flat. 


1 Introduction 


A fundamental question when defining, implementing, and working with languages 
and logics is: How do we represent and analyse syntactic structures? Higher-order 
abstract syntax [19] (or lambda-tree syntax [17]) provides a deceptively simple 
answer to this question. The basic idea to represent syntactic structures is to 
map uniformly binding structures in our object language (OL) to the function 
space in a meta-language, thereby inheriting $\alpha$-renaming and capture-avoiding
substitution. In the logical framework LF [10], for example, we can define a small 
functional programming language consisting of functions, function application, 
and let-expressions using a type tm as follows: 


lam : (tm → tm) → tm.        letv : tm → (tm → tm) → tm.
app : tm → tm → tm.


The object-language term (lam x. lam y. let w = x y in w y) is then encoded as
lam λx.lam λy.letv (app x y) λw.app w y using the LF abstractions to model
binding. Object-level substitution is modelled through LF application; for instance,
the fact that ((lam x.M) N) reduces to [N/x]M in our object language is expressed
as (app (lam M) N) reducing to (M N).
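
The encoding above can be mimicked in a general-purpose language by letting host-language functions play the role of LF abstractions. The following Python sketch is an illustration under that assumption only; the classes `Var`, `Lam`, `App`, `Letv` and the helpers `beta` and `show` are hypothetical names, and Python functions are, unlike LF functions, not restricted to substitution-invariant ones.

```python
from dataclasses import dataclass
from typing import Callable, Union


@dataclass(frozen=True)
class Var:
    name: str  # only used for printing and for free variables


@dataclass(frozen=True)
class Lam:
    body: Callable[["Tm"], "Tm"]  # lam : (tm -> tm) -> tm


@dataclass(frozen=True)
class App:
    fn: "Tm"  # app : tm -> tm -> tm
    arg: "Tm"


@dataclass(frozen=True)
class Letv:
    bound: "Tm"  # letv : tm -> (tm -> tm) -> tm
    body: Callable[["Tm"], "Tm"]


Tm = Union[Var, Lam, App, Letv]


def beta(t: Tm) -> Tm:
    """(app (lam M) N) reduces to (M N): substitution is host-level application."""
    if isinstance(t, App) and isinstance(t.fn, Lam):
        return t.fn.body(t.arg)
    return t


def show(t: Tm, n: int = 0) -> str:
    """Print a term, inventing names x0, x1, ... for bound variables."""
    if isinstance(t, Var):
        return t.name
    if isinstance(t, App):
        return f"({show(t.fn, n)} {show(t.arg, n)})"
    v = Var(f"x{n}")
    if isinstance(t, Lam):
        return f"(lam {v.name}. {show(t.body(v), n + 1)})"
    return f"(let {v.name} = {show(t.bound, n)} in {show(t.body(v), n + 1)})"


# lam x. lam y. let w = x y in w y
example = Lam(lambda x: Lam(lambda y: Letv(App(x, y), lambda w: App(w, y))))

if __name__ == "__main__":
    print(show(example))
    print(show(beta(App(Lam(lambda x: App(x, x)), Var("n")))))
```

No substitution function is defined anywhere in the sketch: the reduction step in `beta` simply applies the stored function to the argument.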

This approach is elegant and can offer substantial benefits: we can treat objects
equivalent modulo renaming and do not need to define object-level substitution.
However, we want not only to construct HOAS trees, but also to analyse
them and to select sub-trees. This is challenging, as sub-trees are context sensitive.
For example, the term letv (app x y) λw.app w y only makes sense in a context
x:tm, y:tm. Moreover, one cannot simply extend LF to allow syntax analysis. If
one simply added a recursion combinator to LF, then it could be used to define
many functions M : tm → tm for which lam M would not represent an object-level
syntax term [12].


Contextual types [18,20] offer a type-theoretic solution to these problems by
reifying the typing judgement, i.e. that letv (app x y) λw.app w y has type tm in
the context x:tm, y:tm, as a contextual type [x:tm, y:tm ⊢ tm]. The contextual type
[x:tm, y:tm ⊢ tm] describes a set of terms of type tm that may contain variables
x and y. In particular, the contextual object [x, y ⊢ letv (app x y) λw.app w y]
has the given contextual type. By abstracting over contexts and treating contexts
as first-class, we can now recursively analyse HOAS trees [20,25,21]. Recently,
[23] further generalised these ideas and presented a contextual modal type
theory, Cocon, where we can mix HOAS trees and computations, i.e. we can use
(recursive) computations to analyse and traverse (contextual) HOAS trees and we
can embed computations within HOAS trees. This line of work provides a syntactic
perspective on the question of how to represent and analyse syntactic structures
with binders, as it focuses on decidability of type checking and normalisation.
However, its semantics is not yet well understood. What is the semantic
meaning of a contextual type? Can we semantically justify the given induction
principles? What is the semantics of a first-class context?


While a number of closely related categorical models of abstract syntax with
bindings [12,8,9] were proposed around 2000, the relationship of these models
to concrete type-theoretic languages for computing with HOAS structures was
tenuous. In this paper, we give a category-theoretic semantics for Cocon (for
simply-typed HOAS). This provides a semantic perspective on contextual types
and first-class contexts. Maybe surprisingly, the presheaf model introduced by
Hofmann [12] already provides the necessary structure to also model contextual
modal type theory. Besides the standard structure of this model, we only need two
additional concepts: a $\flat$-modality and a cartesian closed universe of
representables. For simplicity and lack of space, we focus on the special case of
Cocon where the HOAS trees are simply-typed. Concentrating on the simply-typed
setting allows us to introduce the main idea without the additional complexity
that type dependencies bring with them. We outline the dependently-typed case in
Sec. 6.


Our work provides a semantic foundation to Cocon and can serve as a starting 
point to investigate connections to similar work. First, our work connects Cocon 
to other work on internal languages for presheaf categories with a $\flat$-modality,
such as spatial type theory [27] or crisp type theory [16]. Second, it may help 
to understand the relations of Cocon to type theories that use a modality for 
metaprogramming and intensional recursion, such as [15]. While Cocon is built 
on the same general ideas, a main difference seems to be that Cocon distinguishes 
between HOAS trees and computations, even though it allows mixed use of them. 
We hope to clarify the relation by providing a semantical perspective.


2 Presheaves for Higher-Order Abstract Syntax


Our work begins with the presheaf models for HOAS of [12,8]. The key idea of 
those approaches is to integrate substitution-invariance in the computational 
universe in a controlled way. For the representation of abstract syntax, one wants 
to allow only substitution-invariant constructions. For example, lam M represents
an object-level abstraction if and only if M is a function that uses its argument in 
a substitution-invariant way. For computation with abstract syntax, on the other 
hand, one wants to allow non-substitution-invariant constructions too. Presheaf 
categories allow one to choose the desired amount of substitution-invariance. 

Let D be a small category. The presheaf category $\hat{D}$ is defined to be the
category $\mathrm{Set}^{D^{op}}$. Its objects are functors $F : D^{op} \to \mathrm{Set}$,
which are also called presheaves. Such a functor $F$ is given by a set $F(\Psi)$
for each object $\Psi$ of D together with a function
$F(\sigma) : F(\Phi) \to F(\Psi)$ for any object $\Phi$ and $\sigma : \Psi \to \Phi$ in
D, subject to the functor laws. The intuition is that $F$ defines sets of elements
in various D-contexts, together with a D-substitution action. A morphism
$f : F \to G$ is a natural transformation, which is a family of functions
$f_\Psi : F(\Psi) \to G(\Psi)$ for any $\Psi$. This family of functions must be
natural, i.e. commute with substitution:
$f_\Psi \circ F(\sigma) = G(\sigma) \circ f_\Phi$.

For the purposes of modelling higher-order abstract syntax, D will typically 
be the term model of some domain-level lambda-calculus. By domain-level, we 
mean the calculus that serves as the meta-level for object-language encodings. It 
is the calculus that contains constants like lam and app from the Introduction. We 
call it domain-level to avoid possible confusion between different meta-levels later. 
For simplicity, let us for now use a simply-typed lambda-calculus with functions 
and products as the domain language. It is sufficient to encode the example from 
the Introduction and allows us to explain the main idea underlying our approach. 

The term model of the simply-typed domain-level lambda-calculus forms a
cartesian closed category D. The objects of D are contexts
$x_1 : A_1, \dots, x_n : A_n$ of simple types. We use $\Phi$ and $\Psi$ to range over
such contexts. A morphism from $x_1 : A_1, \dots, x_n : A_n$ to
$x_1 : B_1, \dots, x_m : B_m$ is a tuple $(t_1, \dots, t_m)$ of terms
$x_1 : A_1, \dots, x_n : A_n \vdash t_i : B_i$ for $i = 1, \dots, m$. A morphism of type
$\Psi \to \Phi$ in D thus amounts to a (domain-level) substitution that provides a
(domain-level) term in context $\Psi$ for each of the variables in $\Phi$. Terms
are identified up to $\alpha\beta\eta$-equality. One may achieve this by using a de
Bruijn encoding, for example, but the specific encoding is not important for this
paper. The terminal object is the empty context, which we denote by 1, and the
product $\Phi \times \Psi$ is defined by context concatenation. It is not hard to see
that any object $x_1 : A_1, \dots, x_n : A_n$ is isomorphic to an object that is given
by a context with a single variable, namely $x_1 : (A_1 \times \dots \times A_n)$. This
is to say that contexts can be identified with product types. In view of this
isomorphism, we shall allow ourselves to consider the objects of D also as types
and vice versa. The category D is cartesian closed, the exponential of $\Phi$ and
$\Psi$ being given by the function type $\Phi \to \Psi$ (where the objects are
considered as types).

The presheaf category $\hat{D}$ is a computational universe that both embeds the
term model D and can represent computations about it. Note that we cannot
just enrich D with terms for computations if we want to use HOAS. In a
simply-typed lambda-calculus with just the constant terms app : tm → tm → tm and
lam : (tm → tm) → tm, each term of type tm represents an object-level term. This
would not be true anymore if we were to allow computations in the domain
language, since one could define M to be something like (λx. if x represents
an object-level application then M1 else M2) for distinct M1 and M2. In this
case, lam M would not represent an object-level term anymore. If we want to
preserve a bijection between the object-level terms and their representations
in the domain language, we cannot allow case distinction over whether a term
represents an object-level application.
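
The problem with such a case distinction can be made concrete. The sketch below is a hypothetical first-order fragment (only variables and applications; `subst`, `respects_substitution`, `honest` and `exotic` are names invented for this illustration) that tests whether a candidate body M commutes with substitution for one given argument.

```python
from dataclasses import dataclass
from typing import Callable, Union


@dataclass(frozen=True)
class Var:
    name: str


@dataclass(frozen=True)
class App:
    fn: "Tm"
    arg: "Tm"


Tm = Union[Var, App]


def subst(t: Tm, name: str, n: Tm) -> Tm:
    """Replace every occurrence of the variable `name` in t by n."""
    if isinstance(t, Var):
        return n if t.name == name else t
    return App(subst(t.fn, name, n), subst(t.arg, name, n))


def respects_substitution(body: Callable[[Tm], Tm], n: Tm) -> bool:
    """Does applying body to a variable and then substituting n agree with body(n)?"""
    z = Var("z")  # assumed not to occur in n or in the constants used by body
    return subst(body(z), "z", n) == body(n)


M1, M2 = Var("c1"), Var("c2")  # two distinct terms, as in the text


def honest(x: Tm) -> Tm:
    """Uses its argument uniformly: represents the object-level body 'x x'."""
    return App(x, x)


def exotic(x: Tm) -> Tm:
    """Inspects its argument: 'if x is an application then M1 else M2'."""
    return M1 if isinstance(x, App) else M2


if __name__ == "__main__":
    n = App(Var("a"), Var("b"))
    print(respects_substitution(honest, n), respects_substitution(exotic, n))
```

The function `exotic` returns M2 on a variable but M1 once an application is substituted for that variable, so it does not correspond to any object-level term with one free variable.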

The category $\hat{D}$ unites syntax with computations by allowing one to enforce
various degrees of substitution-invariance. By choosing objects with different
substitution actions, one can control the required amount of
substitution-invariance.

In one extreme, a set $S$ can be represented by the constant presheaf $\Delta S$
with $\Delta S(\Psi) = S$ and $\Delta S(\sigma) = \mathrm{id}$ for all $\Psi$ and
$\sigma$. The substitution action is trivial. As a consequence, a morphism
$\Delta S \to \Delta T$ amounts to a function from set $S$ to set $T$, since the
trivial choice of the substitution action makes the naturality condition vacuous.

The Yoneda embedding represents the other extreme. For any object $\Phi$ of D,
the presheaf $y(\Phi) : D^{op} \to \mathrm{Set}$ is defined by
$y(\Phi)(\Psi) = D(\Psi, \Phi)$, which is the set of morphisms from $\Psi$ to $\Phi$
in D. The functor action is pre-composition. The presheaf $y(\Phi)$ should be
understood as the type of all domain-level substitutions with codomain $\Phi$. An
important example is Tm := y(tm). In this case, Tm($\Psi$) is the set of all
morphisms of type $\Psi \to$ tm in D. By the definition of D, these correspond
to domain-level terms of type tm in context $\Psi$. In this way, the presheaf Tm
represents the domain-level terms of type tm.

The Yoneda embedding does in fact embed D into $\hat{D}$ fully and faithfully. The
Yoneda embedding becomes a functor $y : D \to \hat{D}$ if one defines the morphism
action to be post-composition. This means that $y$ maps a morphism
$\sigma : \Psi \to \Phi$ in D to the natural transformation
$y(\sigma) : y(\Psi) \to y(\Phi)$ that is defined by post-composing with $\sigma$.
This definition makes $y$ into a functor $y : D \to \hat{D}$ that is moreover full
and faithful: its action on morphisms is a bijection from $D(\Psi, \Phi)$ to
$\hat{D}(y(\Psi), y(\Phi))$ for any $\Psi$ and $\Phi$. This is because a natural
transformation $f : y(\Psi) \to y(\Phi)$ is, by naturality, uniquely determined by
$f_\Psi(\mathrm{id})$, where $\mathrm{id} \in D(\Psi, \Psi) = y(\Psi)(\Psi)$, and
$f_\Psi(\mathrm{id})$ is an element of $y(\Phi)(\Psi) = D(\Psi, \Phi)$.
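
The shape of this argument can be tried out on a small stand-in for D. The sketch below makes a strong simplifying assumption: contexts are just numbers of variables and a morphism from psi to phi picks one variable of psi for each variable of phi, so terms are restricted to variables. All function names (`hom`, `identity`, `compose`, `y_act`, `y_mor`, `is_natural`) are hypothetical.

```python
from itertools import product
from typing import Callable, List, Tuple

Mor = Tuple[int, ...]  # sigma: psi -> phi has length phi and entries below psi


def hom(psi: int, phi: int) -> List[Mor]:
    """All morphisms psi -> phi, i.e. the set y(phi)(psi)."""
    return [tuple(s) for s in product(range(psi), repeat=phi)]


def identity(phi: int) -> Mor:
    return tuple(range(phi))


def compose(tau: Mor, sigma: Mor) -> Mor:
    """tau after sigma, for sigma: psi -> phi and tau: phi -> xi."""
    return tuple(sigma[i] for i in tau)


def y_act(s: Mor, rho: Mor) -> Mor:
    """Presheaf action of y(phi) on rho: pre-composition, s |-> s . rho."""
    return compose(s, rho)


def y_mor(sigma: Mor) -> Callable[[Mor], Mor]:
    """Components of y(sigma): post-composition, s |-> sigma . s."""
    return lambda s: compose(sigma, s)


def is_natural(sigma: Mor, psi: int, omega: int, omega2: int) -> bool:
    """Check that y(sigma) commutes with the action of every rho: omega2 -> omega."""
    return all(
        y_mor(sigma)(y_act(s, rho)) == y_act(y_mor(sigma)(s), rho)
        for s in hom(omega, psi)
        for rho in hom(omega2, omega)
    )


if __name__ == "__main__":
    # Evaluating y(sigma) at the identity recovers sigma.
    print(all(y_mor(s)(identity(2)) == s for s in hom(2, 3)))
    print(is_natural((0, 1, 0), psi=2, omega=2, omega2=3))
```

Evaluating a component at the identity recovers the morphism, which is the direction of the bijection used in the text; the converse direction (every natural transformation arises this way) is not checked by the sketch.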

Since D embeds into $\hat{D}$ fully and faithfully, the term model of the domain
language is available in $\hat{D}$. Consider for example Tm = y(tm). Since $y$ is
full and faithful, the morphisms from Tm to Tm in $\hat{D}$ are in one-to-one
correspondence with the morphisms from tm to tm in D. These, in turn, are defined
to be substitutions and correspond to simply-typed (domain-level) lambda terms
with one free variable. This shows that substitution invariance cuts down the
morphisms from Tm to Tm in $\hat{D}$ just as much as one would like for HOAS
encodings.

But $\hat{D}$ contains not just a term model of the domain language. It can also
represent computations about the domain-level syntax and computations that
are not substitution-invariant. For example, arbitrary functions on terms can
be represented as morphisms from the constant presheaf $\Delta(\mathrm{Tm}(1))$ to Tm.
Recall that 1 is the empty context, so that Tm(1) is the set D(1, tm), by
definition, which is isomorphic to the set of closed domain-level terms of type
tm. The morphisms from $\Delta(\mathrm{Tm}(1))$ to Tm in $\hat{D}$ correspond to
arbitrary functions from closed terms to closed terms, without any restriction of
substitution invariance.

The restriction to the constant presheaf of closed terms can be generalised to
arbitrary presheaves. Define a functor $\flat : \hat{D} \to \hat{D}$ by letting
$\flat F$ be the constant presheaf $\Delta(F(1))$, i.e. $\flat F(\Psi) = F(1)$ and
$\flat F(\sigma) = \mathrm{id}$. Thus, $\flat$ restricts any presheaf to the set of its
closed elements. The functor $\flat$ defines a comonad where the counit
$\varepsilon_F : \flat F \to F$ is the obvious inclusion and the comultiplication
$\nu_F : \flat F \to \flat\flat F$ is the identity. The latter means that the comonad
$\flat$ is idempotent.


3 Internal Language 


To explain how $\hat{D}$ models higher-order abstract syntax and contextual types,
we need to expose more of its structure. Most of this structure is standard.
Defining it directly in terms of functors and natural transformations is somewhat
laborious and the technical details may obscure the basic idea of our approach.

We therefore use the internal type theory of $\hat{D}$ as a meta-language for
working with its structure. The structure of $\hat{D}$ furnishes a model of a
dependent type theory that supports dependent products, dependent sums and
extensional identity types, among others, in a standard way [11]. We use Agda
notation for the types and terms of this internal type theory. We write
(x : S) → T for a dependent function type and write λx : S. m and m n for the
associated lambda-abstractions and applications. As usual, we will sometimes also
write S → T for (x : S) → T if x does not appear in T. However, to make it easier
to distinguish the function spaces at various levels, we will write (x : S) → T
by default even when x does not appear in T. We use let x = m in n as an
abbreviation for (λx : T. n) m, as usual. For two terms m : T and n : T, we write
$m =_T n$ or just m = n for the associated identity type. Our notation is similar
to Agda's, since the internal type theory can be seen as a fragment of Agda's
type theory. Agda has been useful as a tool for type-checking our constructions
in the internal type theory [1].

In the spirit of Martin-Löf type theory, we will define basic types and terms
successively as they are needed. In the Agda development this corresponds to
postulating constants that are justified by the interpretation in $\hat{D}$. In the
following sections, we will expose the structure of $\hat{D}$ step by step until we
have enough to interpret contextual types.

While much of the structure of $\hat{D}$ can be captured by adding rules and
constants to standard Martin-Löf type theory, for the comonad $\flat$ such a
formulation would not be very satisfactory. The issues are discussed by Shulman
[27, p.7], for example. To obtain a more satisfactory syntax for the comonad, we
refine the internal type theory into a modal type theory in which $\flat$ appears
as a necessity modality. This approach goes back to [3,4,6] and is also used by
recent work of Shulman [27], Licata et al. [16] and others on working with the
$\flat$-modality in type theory. Agda has recently gained support for such a
$\flat$-modality [29].

We summarise here the typing rules for the $\flat$-modality which we will rely on.
To control the modality, one uses two kinds of variables. In addition to standard
variables x : T, one has a second kind of so-called crisp variables x :: T. Typing
judgements have the form $\Delta \mid \Theta \vdash m : T$, where $\Delta$ collects the
crisp variables and $\Theta$ collects the ordinary variables. In essence, a crisp
variable x :: T represents an assumption of the form $x : \flat T$. The syntactic
distinction is useful, since it leads to a type theory that is well-behaved with
respect to substitution, see [6,27].

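The typing rules are closely related to those in modal type systems [6,18],
where $\Delta$ is the typing context for modal (global) assumptions and $\Theta$ for
(local) assumptions, and type systems for linear logic [4], where $\Delta$ is the
typing context for non-linear assumptions and $\Theta$ for linear assumptions.

$$
\frac{}{\Delta, u{::}T, \Delta' \mid \Theta \vdash u : T}
\qquad
\frac{}{\Delta \mid \Theta, x{:}T, \Theta' \vdash x : T}
$$

$$
\frac{\Delta \mid \cdot \vdash m : T}{\Delta \mid \Theta \vdash \mathsf{box}\ m : \flat T}
\qquad
\frac{\Delta \mid \Theta \vdash m : \flat T \qquad \Delta, x{::}T \mid \Theta \vdash n : S}
     {\Delta \mid \Theta \vdash \mathsf{let\ box}\ x = m\ \mathsf{in}\ n : S}
$$

Given any term m : T which only depends on the modal variable context $\Delta$, we
can form the term box m : $\flat T$. We have a let-term let box x = m in n that
takes a term m : $\flat T$ and binds it to a variable x :: T. The rules maintain the
invariant that the free variables in a type $\flat T$ or a term box m are all crisp
variables from the crisp context $\Delta$.

The other typing rules do not modify the crisp context. For example, the
rules for dependent products are:

$$
\frac{\Delta \mid \Theta, x{:}T \vdash m : S}
     {\Delta \mid \Theta \vdash \lambda x{:}T.\, m : (x{:}T) \to S}
\qquad
\frac{\Delta \mid \Theta \vdash m : (y{:}T) \to S \qquad \Delta \mid \Theta \vdash n : T}
     {\Delta \mid \Theta \vdash m\ n : [n/y]S}
$$

When $\Delta$ is empty, we shall write just $\Theta \vdash m : T$ for
$\Delta \mid \Theta \vdash m : T$.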


4 From Presheaves to Contextual Types 


Armed with the internal type theory, we can now explore the structure of $\hat{D}$.


4.1 A Universe of Representables 


For our purposes, the main feature of $\hat{D}$ is that it embeds D fully and faithfully via
the Yoneda embedding. In the type theory for $\hat{D}$, we may capture this embedding
by means of a Tarski-style universe. Such a universe is defined by a type of codes
for types together with a decoding function that maps codes to actual types.
The type of codes Obj represents the set of objects of D in the internal type
theory of $\hat{D}$. We have seen above that any set can be represented as a presheaf
with trivial substitution action, and Obj is one such example. Particular objects
of D then appear as terms of type Obj. The cartesian closed structure of D gives
us terms unit, times, arrow for the terminal object 1, finite products $\times$ and the
exponential (function type). We also have a term for the domain-level type tm.

\[
\begin{array}{lll}
\vdash \mathsf{Obj}\ \mathsf{type} & \vdash \mathsf{tm} : \mathsf{Obj} & \vdash \mathsf{times} : (a : \mathsf{Obj}) \to (b : \mathsf{Obj}) \to \mathsf{Obj} \\
 & \vdash \mathsf{unit} : \mathsf{Obj} & \vdash \mathsf{arrow} : (a : \mathsf{Obj}) \to (b : \mathsf{Obj}) \to \mathsf{Obj}
\end{array}
\]

Subsequently, we sometimes talk about objects of D when we intend to describe
terms of type Obj (and vice versa).

The morphisms of D could similarly be encoded as a constant presheaf with
many term constants, but this is in fact not necessary. Instead, we can use the
Yoneda embedding as a function that decodes elements of Obj into actual types.

\[
x : \mathsf{Obj} \vdash \mathsf{El}\ x\ \mathsf{type}
\]

The function El is almost direct syntax for the Yoneda embedding. The interpretation
in $\hat{D}$ is such that, for any object A of D, the type El A is interpreted by
the presheaf y(A). Such a presheaf is called representable. One can think of El A
as the type of all morphisms of type $\Psi \to A$ in D for arbitrary $\Psi$. Recall from
above that a morphism of type $\Psi \to A$ in D amounts to a domain-level term of
type A that may refer to variables in $\Psi$. In this sense, one should think of El A
as a type of domain-level terms of type A, both closed and open ones.

We get all morphisms of D, and no more, in this way, since the Yoneda
embedding is full and faithful, recall Sec. 2. In our case, this means that the type
$(x : \mathsf{El}\,A) \to \mathsf{El}\,B$ represents the morphisms of type $A \to B$ in D. Any closed term
of type $(x : \mathsf{El}\,A) \to \mathsf{El}\,B$ corresponds to such a morphism and vice versa. This
is because the naturality requirements in $\hat{D}$ enforce substitution-invariance, as
outlined in Sec. 2. The type $(x : \mathsf{El}\,A) \to \mathsf{El}\,B$ thus does not represent arbitrary
functions from terms of type A to terms of type B, but only substitution-invariant
ones. If a function of this type maps a domain-level variable x: A (encoded as an
element of El A) to some term M: B (encoded as an element of El B), then it
must map any other N: A to [N/x]M.

We note that the type dependency in El is easy to work with. A term of
type $(a : \mathsf{Obj}) \to (b : \mathsf{Obj}) \to (x : \mathsf{El}\,a) \to \mathsf{El}\,b$ corresponds to a family of terms
$(x : \mathsf{El}\,A) \to \mathsf{El}\,B$ indexed by objects A and B in D. This is because Obj is just a
set, so that the naturality constraints of $\hat{D}$ are vacuous for functions out of Obj.

To summarise, we get access to D in the internal type theory of $\hat{D}$ simply by
considering the Yoneda embedding as the decoding function El of a universe à la
Tarski. Since it consists of the representable presheaves, we call it the universe of
representables. The following lemmas state that the embedding preserves terminal
object, binary products and the exponential.


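Lemma 1. The internal type theory of $\hat{D}$ has a term $\vdash \mathsf{terminal} : \mathsf{El}\ \mathsf{unit}$, such
that x = terminal holds for any x: El unit.


Lemma 2. The internal type theory of $\hat{D}$ justifies the terms below, such that
fst (pair x y) = x, snd (pair x y) = y, z = pair (fst z) (snd z) for all x, y, z.

\[
\begin{array}{l}
c : \mathsf{Obj},\ d : \mathsf{Obj} \vdash \mathsf{fst} : (z : \mathsf{El}\,(\mathsf{times}\ c\ d)) \to \mathsf{El}\,c \\
c : \mathsf{Obj},\ d : \mathsf{Obj} \vdash \mathsf{snd} : (z : \mathsf{El}\,(\mathsf{times}\ c\ d)) \to \mathsf{El}\,d \\
c : \mathsf{Obj},\ d : \mathsf{Obj} \vdash \mathsf{pair} : (x : \mathsf{El}\,c) \to (y : \mathsf{El}\,d) \to \mathsf{El}\,(\mathsf{times}\ c\ d)
\end{array}
\]


Lemma 3. The internal type theory of $\hat{D}$ justifies the terms below such that
arrow-i (arrow-e f) = f and arrow-e (arrow-i g) = g for all f, g.

\[
\begin{array}{l}
c : \mathsf{Obj},\ d : \mathsf{Obj} \vdash \mathsf{arrow\text{-}e} : (x : \mathsf{El}\,(\mathsf{arrow}\ c\ d)) \to (y : \mathsf{El}\,c) \to \mathsf{El}\,d \\
c : \mathsf{Obj},\ d : \mathsf{Obj} \vdash \mathsf{arrow\text{-}i} : (y : (\mathsf{El}\,c \to \mathsf{El}\,d)) \to \mathsf{El}\,(\mathsf{arrow}\ c\ d)
\end{array}
\]


4.2 Higher-Order Abstract Syntax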


The last lemma in the previous section states that $\mathsf{El}\,A \to \mathsf{El}\,B$ is isomorphic
to El (arrow A B). This is particularly useful to lift HOAS-encodings from D
to $\hat{D}$. For instance, the domain-level term constant $\mathsf{lam} : (\mathsf{tm} \to \mathsf{tm}) \to \mathsf{tm}$ gives
rise to an element of El (arrow (arrow tm tm) tm). But this type is isomorphic
to $(\mathsf{El}\ \mathsf{tm} \to \mathsf{El}\ \mathsf{tm}) \to \mathsf{El}\ \mathsf{tm}$, by the lemma.

This means that the higher-order abstract syntax constants lift to $\hat{D}$:

\[
\mathsf{app} : (m : \mathsf{El}\ \mathsf{tm}) \to (n : \mathsf{El}\ \mathsf{tm}) \to \mathsf{El}\ \mathsf{tm}
\qquad
\mathsf{lam} : (m : (\mathsf{El}\ \mathsf{tm} \to \mathsf{El}\ \mathsf{tm})) \to \mathsf{El}\ \mathsf{tm}
\]

Once one recognises El A as y(A), the adequacy of this higher-order abstract
syntax encoding lifts from D to $\hat{D}$ as in [12]. For example, an argument M to
lam has type $\mathsf{El}\ \mathsf{tm} \to \mathsf{El}\ \mathsf{tm}$, which is isomorphic to El (arrow tm tm). But this
type represents (open) domain-level terms $t : \mathsf{tm} \to \mathsf{tm}$. The term lam M: El tm
then represents the domain-level term lam t: tm, so it just lifts the domain-level.
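The shape of the lifted constants can be mimicked in an ordinary programming language. The following Python sketch is only an analogy: the class names and the helper `show` are hypothetical, and Python functions are not restricted to substitution-invariant ones, whereas the function space $\mathsf{El}\ \mathsf{tm} \to \mathsf{El}\ \mathsf{tm}$ in $\hat{D}$ is. It shows `Lam` taking a meta-level function as its argument, and how a binder is opened by applying that function to a fresh variable.

```python
from dataclasses import dataclass
from typing import Callable

@dataclass(frozen=True)
class Var:
    # A domain-level variable; only created when a binder is opened.
    name: str

@dataclass(frozen=True)
class App:
    fun: object
    arg: object

@dataclass(frozen=True)
class Lam:
    # The body is a meta-level (Python) function from terms to terms.
    body: Callable[[object], object]

def show(term, depth=0):
    """Print a HOAS term by applying each binder body to a fresh variable."""
    if isinstance(term, Var):
        return term.name
    if isinstance(term, App):
        return f"({show(term.fun, depth)} {show(term.arg, depth)})"
    if isinstance(term, Lam):
        fresh = Var(f"x{depth}")
        return f"(lam {fresh.name}. {show(term.body(fresh), depth + 1)})"
    raise TypeError("unknown term")

identity = Lam(lambda x: x)
self_application = Lam(lambda x: App(x, x))
constant = Lam(lambda x: Lam(lambda y: x))
```

Here `show(constant)` yields `(lam x0. (lam x1. x0))`; substitution of a term for a bound variable is simply application of the Python function stored in `Lam`.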


4.3 Closed Objects 


One should think of $\flat T$ as the type of 'closed' elements of T. In particular,
$\flat(\mathsf{El}\,A)$ represents morphisms of type $1 \to A$ in D, recall the definition of $\flat$ from
Sec. 2 and that El A corresponds to yA. In the term model D, the morphisms
$1 \to A$ correspond to closed domain-language terms of type A. Thus, while El A
represents both open and closed domain-level terms, $\flat(\mathsf{El}\,A)$ represents only the
closed ones.

This applies also to the type $\mathsf{El}\,A \to \mathsf{El}\,B$. We have seen above that $\mathsf{El}\,A \to \mathsf{El}\,B$
is isomorphic to El (arrow A B) and may therefore be thought of as
containing the terms of type B with a distinguished variable of type A. But these
terms may contain other free domain language variables. The type $\flat(\mathsf{El}\,A \to \mathsf{El}\,B)$,
on the other hand, contains only terms of type B that may contain (at most)
one variable of type A.

Restricting to closed objects with the modality is useful because it disables
substitution-invariance. For example, the internal type theory for $\hat{D}$ justifies
a function $\mathsf{is\text{-}lam} : (x : \flat(\mathsf{El}\ \mathsf{tm})) \to \mathsf{bool}$ that returns true if and only if the
argument represents a domain language lambda abstraction. We shall define it
in the next section. Such a function cannot be defined with type $\mathsf{El}\ \mathsf{tm} \to \mathsf{bool}$,
since it would not be invariant under substitution. Its argument ranges over
terms that may be open, which in particular includes domain-level variables. The
function would have to return false for them, since a domain-level variable is
not a lambda-abstraction. But after substituting a lambda-abstraction for the
variable, it would have to return true, so it could not be substitution-invariant.
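The failure of substitution-invariance can be made concrete on a first-order representation of terms. The Python sketch below is illustrative only: the datatype and the functions `subst` and `is_lam` are hypothetical names, and `subst` assumes that the replacement term is closed, so that no variable capture can occur.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Var:
    name: str

@dataclass(frozen=True)
class App:
    fun: object
    arg: object

@dataclass(frozen=True)
class Lam:
    var: str
    body: object

def subst(term, name, replacement):
    """Substitute the closed term `replacement` for the free variable `name`."""
    if isinstance(term, Var):
        return replacement if term.name == name else term
    if isinstance(term, App):
        return App(subst(term.fun, name, replacement),
                   subst(term.arg, name, replacement))
    if isinstance(term, Lam):
        if term.var == name:
            # `name` is rebound here, so it has no free occurrence below.
            return term
        return Lam(term.var, subst(term.body, name, replacement))
    raise TypeError("unknown term")

def is_lam(term):
    """Test whether the term is, syntactically, a lambda abstraction."""
    return isinstance(term, Lam)

identity = Lam("y", Var("y"))
open_term = Var("x")

# The answer changes under substitution, so is_lam is not substitution-invariant.
before = is_lam(open_term)
after = is_lam(subst(open_term, "x", identity))
```

On closed terms, substitution is the identity, so the test is unproblematic there; this is the situation that the modality isolates.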

We note that the type Obj consists only of closed elements and that Obj
and $\flat\mathsf{Obj}$ happen to be definitionally equal types (an isomorphism would suffice,
but equality is more convenient).

4.4 Contextual Objects


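Using function types and the modality, it is now possible to work with contextual
objects that represent domain-level terms in a certain context, much like in [20,21].
A contextual type $\lceil \Psi \vdash A \rceil$ is a boxed function type of the form $\flat(\mathsf{El}\,\Psi \to \mathsf{El}\,A)$. It
represents domain-level terms of type A with variables from $\Psi$. Here, we consider
the domain-level context $\Psi$ as a term that encodes it. The interpretation will
make this precise.

For example, domain-level terms with up to two free variables now appear
as terms of type $\flat(\mathsf{El}\,(\mathsf{times}\ (\mathsf{times}\ \mathsf{unit}\ \mathsf{tm})\ \mathsf{tm}) \to \mathsf{El}\ \mathsf{tm})$, as the following
example illustrates.

\[
\begin{array}{l}
\mathsf{box}\ (\lambda u{:}\mathsf{El}\,(\mathsf{times}\ (\mathsf{times}\ \mathsf{unit}\ \mathsf{tm})\ \mathsf{tm}).\ \mathsf{let}\ x_1 = \mathsf{snd}\ (\mathsf{fst}\ u)\ \mathsf{in} \\
\qquad \mathsf{let}\ x_2 = \mathsf{snd}\ u\ \mathsf{in} \\
\qquad \mathsf{app}\ (\mathsf{lam}\ (\lambda x{:}\mathsf{El}\ \mathsf{tm}.\ \mathsf{app}\ x_1\ x))\ x_2)
\end{array}
\]

The context variables $x_1$ and $x_2$ are bound at the meta level.

This representation integrates substitution as usual. For example, given crisp
variables $m :: \mathsf{El}\,(\mathsf{times}\ c\ \mathsf{tm}) \to \mathsf{El}\ \mathsf{tm}$ and $n :: \mathsf{El}\,c \to \mathsf{El}\ \mathsf{tm}$ for contextual terms, the
term $\mathsf{box}\ (\lambda u{:}\mathsf{El}\,c.\ m\ (\mathsf{pair}\ u\ (n\ u)))$ represents substitution of n for the last
variable in the context of m.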
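The substitution term above can be sketched in Python by modelling an element of $\mathsf{El}\,c$ as a nested tuple of domain-level terms and a contextual term as a function on such tuples. This is an informal illustration under stated assumptions: terms are plain tuples and strings, the names `substitute_last`, `m` and `n` are hypothetical, and nothing here enforces closedness or substitution-invariance.

```python
def fst(p):
    return p[0]

def snd(p):
    return p[1]

def pair(a, b):
    return (a, b)

def app(s, t):
    # First-order stand-in for the domain-level constant app.
    return ("app", s, t)

def substitute_last(m, n):
    """Substitute the contextual term n (in context c) for the last
    variable of the contextual term m (in context c, x:tm)."""
    return lambda u: m(pair(u, n(u)))

# Example with c = (times unit tm), i.e. one variable y.
def m(u):
    # In context y, x: the term "app x y".
    return app(snd(u), snd(fst(u)))

def n(u):
    # In context y: the term "app y y".
    return app(snd(u), snd(u))

env = pair((), "y")                 # the environment ((), y) for context c
result = substitute_last(m, n)(env)
```

In this example `result` is `("app", ("app", "y", "y"), "y")`, that is, `app (app y y) y`, which is `app x y` with `app y y` substituted for `x`.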

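For working with contextual objects, it is convenient to lift the constants app
and lam to contextual types.

\[
\begin{array}{l}
c : \mathsf{Obj} \vdash \mathsf{app'} : \flat(\mathsf{El}\,c \to \mathsf{El}\ \mathsf{tm}) \to \flat(\mathsf{El}\,c \to \mathsf{El}\ \mathsf{tm}) \to \flat(\mathsf{El}\,c \to \mathsf{El}\ \mathsf{tm}) \\
c : \mathsf{Obj} \vdash \mathsf{lam'} : \flat(\mathsf{El}\,(\mathsf{times}\ c\ \mathsf{tm}) \to \mathsf{El}\ \mathsf{tm}) \to \flat(\mathsf{El}\,c \to \mathsf{El}\ \mathsf{tm})
\end{array}
\]

These terms are defined by:

\[
\begin{array}{lcl}
\mathsf{app'} & := & \lambda m, n.\ \mathsf{let\ box}\ m' = m\ \mathsf{in}\ \mathsf{let\ box}\ n' = n\ \mathsf{in}\ \mathsf{box}\ (\lambda u{:}\mathsf{El}\,c.\ \mathsf{app}\ (m'\ u)\ (n'\ u)) \\
\mathsf{lam'} & := & \lambda m.\ \mathsf{let\ box}\ m' = m\ \mathsf{in}\ \mathsf{box}\ (\lambda u{:}\mathsf{El}\,c.\ \mathsf{lam}\ (\lambda x{:}\mathsf{El}\ \mathsf{tm}.\ m'\ (\mathsf{pair}\ u\ x)))
\end{array}
\]

A contextual type for domain-level variables (as opposed to arbitrary terms)
can be defined by restricting the function space in $\flat(\mathsf{El}\,\Psi \to \mathsf{El}\,A)$ to consist
only of projections. Projections are functions of the form $\mathsf{snd} \circ \mathsf{fst}^k$, where
we write $\mathsf{fst}^k$ for the k-fold iteration $\mathsf{fst} \circ \cdots \circ \mathsf{fst}$. Let us write $S \to_v T$
for the subtype of $S \to T$ consisting only of projections. The contextual type
$\flat(\mathsf{El}\,\Psi \to_v \mathsf{El}\,A)$ is then a subtype of $\flat(\mathsf{El}\,\Psi \to \mathsf{El}\,A)$.

With these definitions, we can express a primitive recursion scheme for
contextual types. We write it in its general form where the result type A can
possibly depend on x. This is only relevant for the dependently typed case; in
the simply typed case, the only dependency is on c.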


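Lemma 4. Let $c : \mathsf{Obj},\ x : \flat(\mathsf{El}\,c \to \mathsf{El}\ \mathsf{tm}) \vdash A\ c\ x\ \mathsf{type}$ and define:

\[
\begin{array}{lcl}
X_{var} & := & (c : \mathsf{Obj}) \to (x : \flat(\mathsf{El}\,c \to_v \mathsf{El}\ \mathsf{tm})) \to A\ c\ x \\
X_{app} & := & (c : \mathsf{Obj}) \to (x, y : \flat(\mathsf{El}\,c \to \mathsf{El}\ \mathsf{tm})) \to A\ c\ x \to A\ c\ y \to A\ c\ (\mathsf{app'}\ x\ y) \\
X_{lam} & := & (c : \mathsf{Obj}) \to (x : \flat(\mathsf{El}\,(\mathsf{times}\ c\ \mathsf{tm}) \to \mathsf{El}\ \mathsf{tm})) \to A\ (\mathsf{times}\ c\ \mathsf{tm})\ x \to A\ c\ (\mathsf{lam'}\ x)
\end{array}
\]

Then, $\hat{D}$ justifies a term

\[
\vdash \mathsf{rec} : X_{var} \to X_{app} \to X_{lam} \to (c : \mathsf{Obj}) \to (x : \flat(\mathsf{El}\,c \to \mathsf{El}\ \mathsf{tm})) \to A\ c\ x
\]

such that the following equations are valid.

\[
\begin{array}{lcl}
\mathsf{rec}\ t_{var}\ t_{app}\ t_{lam}\ c\ x & = & t_{var}\ c\ x \qquad \text{if } x : \flat(\mathsf{El}\,c \to_v \mathsf{El}\ \mathsf{tm}) \\
\mathsf{rec}\ t_{var}\ t_{app}\ t_{lam}\ c\ (\mathsf{app'}\ s\ t) & = & t_{app}\ c\ s\ t\ (\mathsf{rec}\ t_{var}\ t_{app}\ t_{lam}\ c\ s)\ (\mathsf{rec}\ t_{var}\ t_{app}\ t_{lam}\ c\ t) \\
\mathsf{rec}\ t_{var}\ t_{app}\ t_{lam}\ c\ (\mathsf{lam'}\ s) & = & t_{lam}\ c\ s\ (\mathsf{rec}\ t_{var}\ t_{app}\ t_{lam}\ (\mathsf{times}\ c\ \mathsf{tm})\ s)
\end{array}
\]


Proof (outline). To outline the proof idea, note first that a function of type
$(c : \mathsf{Obj}) \to (x : \flat(\mathsf{El}\,c \to \mathsf{El}\ \mathsf{tm})) \to A\ c\ x$ in $\hat{D}$ corresponds to an inhabitant of
$A\ \Phi\ t$ for each concrete object $\Phi$ of D and each inhabitant $t : \flat(\mathsf{El}\,\Phi \to \mathsf{El}\ \mathsf{tm})$. This
is because naturality constraints for boxed types are vacuous (and $\mathsf{Obj} = \flat\mathsf{Obj}$).
Next, note that inhabitants of $\flat(\mathsf{El}\,\Phi \to \mathsf{El}\ \mathsf{tm})$ correspond to domain-level terms
of type tm in context $\Phi$ up to $\alpha\beta\eta$-equality. We can perform a case-distinction on
whether it is a variable, abstraction or application and depending on the result
use $t_{var}$, $t_{app}$ or $t_{lam}$ to define the required inhabitant of $A\ \Phi\ t$.


As a simple example for rec, we can define the function is-lam discussed
above by $\mathsf{rec}\ (\lambda c, x.\ \mathsf{false})\ (\lambda c, x, y, r_x, r_y.\ \mathsf{false})\ (\lambda c, x, r_x.\ \mathsf{true})$.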
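The recursion scheme of Lemma 4 can be sketched on a first-order representation of terms in context. The following Python code is a simplified illustration, not the construction in the model: a context is assumed to consist only of variables of type tm and is represented by its length, a variable is represented by the index k of the projection $\mathsf{snd} \circ \mathsf{fst}^k$, and terms are compared syntactically rather than up to $\alpha\beta\eta$-equality. The tuple encoding and the function names are hypothetical.

```python
def rec(t_var, t_app, t_lam, c, x):
    """Primitive recursion over a term x in a context of c variables.

    Terms are tuples: ("var", k), ("app", s, t) or ("lam", s), where the
    body s of ("lam", s) lives in the context extended by one variable.
    """
    tag = x[0]
    if tag == "var":
        return t_var(c, x)
    if tag == "app":
        _, s, t = x
        return t_app(c, s, t,
                     rec(t_var, t_app, t_lam, c, s),
                     rec(t_var, t_app, t_lam, c, t))
    if tag == "lam":
        _, s = x
        # The recursive call happens in the extended context.
        return t_lam(c, s, rec(t_var, t_app, t_lam, c + 1, s))
    raise ValueError(f"unknown term constructor {tag!r}")

def is_lam(c, x):
    """The function is-lam, defined with the three branches given in the text."""
    return rec(lambda c, x: False,
               lambda c, x, y, rx, ry: False,
               lambda c, x, rx: True,
               c, x)

def size(c, x):
    """A second example: the number of constructors in a term."""
    return rec(lambda c, x: 1,
               lambda c, x, y, rx, ry: 1 + rx + ry,
               lambda c, x, rx: 1 + rx,
               c, x)
```

For example, `is_lam(0, ("lam", ("var", 0)))` is `True`, while `is_lam(1, ("var", 0))` is `False`.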


5 Simple Contextual Modal Type Theory 


We have outlined informally how the internal dependent type theory of $\hat{D}$ can
model contextual types. In this section, we make this precise by giving the
interpretation of Cocon [23], a contextual modal type theory where we can work
with contextual HOAS trees and computations about them, into $\hat{D}$. We will
focus here on a simply-typed version of Cocon where we use a simply-typed 
domain-language with constants app and lam and also only allow computations 
about HOAS trees, but do not consider, for example, universes. Concentrating 
on a stripped down, simply-typed version of Cocon allows us to focus on the 
essential aspects, namely how to interpret domain-level contexts and domain-level 
contextual objects and types semantically. The generalisation to a dependently 
typed domain-level such as LF in Sec. 6 will be conceptually straightforward, 
although more technical. Handling universes is an orthogonal issue (see also [16]). 

We first define our simply-typed domain-level with the type tm and the term
constants lam and app (see Fig. 1). Following Cocon, we allow computations to
be embedded into domain-level terms via unboxing. The intuition is that if a
program t promises to compute a value of type $\lceil x{:}\mathsf{tm}, y{:}\mathsf{tm} \vdash \mathsf{tm} \rceil$, then we can
embed t directly into a domain-level object writing $\mathsf{lam}\ \lambda x.\mathsf{lam}\ \lambda y.\mathsf{app}\ \lfloor t \rfloor\ x$,
unboxing t. Domain-level objects (resp. types) can be packaged together with
their domain-level context to form a contextual object (resp. type). Domain-level
contexts are formed as usual, but may contain context variables to describe a
yet unknown prefix. Last, we include domain-level substitutions that allow us to
move between domain-level contexts. The compound substitution $\sigma, M$ extends
the substitution $\sigma$ with domain $\hat\Psi$ to a substitution with domain $\hat\Psi, x$, where
M replaces x. Following [18,23], we do not store the domain (like $\hat\Psi$) in the
substitution, as it can always be recovered before applying the substitution. We
also include a weakening substitution, written as $\mathsf{wk}_{\hat\Psi}$, to describe the weakening of
the domain $\Psi$ to $\Psi, \overrightarrow{x{:}A}$. Weakening substitutions are necessary, as they allow
us to express the weakening of a context variable $\psi$. Identity is a special form
of the $\mathsf{wk}_{\hat\Psi}$ substitution, which follows immediately from the typing rule of $\mathsf{wk}_{\hat\Psi}$.
Composition is admissible.

\[
\begin{array}{llcl}
\text{Domain-level types} & A, B & ::= & \mathsf{tm} \mid A \to B \\
\text{Domain-level terms} & M, N & ::= & \lambda x.M \mid M\ N \mid x \mid \mathsf{lam} \mid \mathsf{app} \mid \lfloor t \rfloor_\sigma \\
\text{Domain-level contexts} & \Psi, \Phi & ::= & \cdot \mid \psi \mid \Psi, x{:}A \\
\text{Domain-level context (erased)} & \hat\Psi, \hat\Phi & ::= & \cdot \mid \psi \mid \hat\Psi, x \\
\text{Domain-level substitutions} & \sigma & ::= & \cdot \mid \mathsf{wk}_{\hat\Psi} \mid \sigma, M \\
\text{Contextual types} & T & ::= & \Psi \vdash A \mid \Psi \vdash_v A \\
\text{Contextual objects} & C & ::= & \hat\Psi \vdash M \\
\text{Domain of discourse} & \breve\tau & ::= & \tau \mid \mathsf{ctx} \\
\text{Types and Terms} & \tau, \mathcal{I} & ::= & \lceil T \rceil \mid (y : \breve\tau_1) \Rightarrow \tau_2 \\
 & t, s & ::= & y \mid \lceil C \rceil \mid \mathsf{rec}^{\mathcal{I}}\ \mathcal{B}\ \Psi\ t \mid \mathsf{fn}\ y \Rightarrow t \mid t_1\ t_2 \\
\text{Branches} & \mathcal{B} & ::= & \Gamma \mapsto t \\
\text{Contexts} & \Gamma & ::= & \cdot \mid \Gamma, y : \breve\tau
\end{array}
\]

Fig. 1. Syntax of COCON with a fixed simply-typed domain tm

We summarise the typing rules for domain-level terms and types in Fig. 2.
We also include typing rules for domain-level contexts. Note that since we restrict
ourselves to a simply-typed domain-level, we simply check that A is a well-formed
type. We defer the reduction and expansion rules to the appendix and only
remark here that equality for domain-level terms and substitution is modulo $\beta\eta$.
In particular, $\lfloor \lceil \hat\Phi \vdash N \rceil \rfloor_\sigma$ reduces to $[\sigma]N$.

In our grammar, we distinguish between the contextual type $\Psi \vdash A$ and
the more restricted contextual type $\Phi \vdash_v A$ which characterises only variables
of type A from the domain-level context $\Phi$. We give here two sample typing
rules for $\Phi \vdash_v A$ which are the ones used most in practice to illustrate the
main idea. We embed contextual objects into computations via the modality.
Computation-level types include boxed contextual types, $\lceil \Phi \vdash A \rceil$, and function
types, written as $(y : \breve\tau_1) \Rightarrow \tau_2$. We overload the function space and allow as
domain of discourse both computation-level types and the schema ctx of domain-level
contexts, although only in the latter case y can occur in $\tau_2$. We use $\mathsf{fn}\ y \Rightarrow t$
to introduce functions of both kinds. We also overload function application t s
to eliminate function types $(y : \tau_1) \Rightarrow \tau_2$ and $(y : \mathsf{ctx}) \Rightarrow \tau_2$, although in the
latter case s stands for a domain-level context. We separate domain-level contexts
from contextual objects, as we do not allow functions that return a domain-level
context.

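The recursor is written as $\mathsf{rec}^{\mathcal{I}}\ \mathcal{B}\ \Psi\ t$. Here, t describes a term of type $\lceil \Psi \vdash \mathsf{tm} \rceil$
that we recurse over and $\mathcal{B}$ describes the different branches that we can take
depending on the value computed by t. As is common when we have dependencies,
we annotate the recursor with the typing invariant $\mathcal{I}$. Here, we consider only
the recursor over domain-level terms of type tm. Hence, we annotate it with
$\mathcal{I} = (\psi : \mathsf{ctx}) \Rightarrow (y : \lceil \psi \vdash \mathsf{tm} \rceil) \Rightarrow \tau$. To check that the recursor $\mathsf{rec}^{\mathcal{I}}\ \mathcal{B}\ \Psi\ t$ has
type $[\Psi/\psi]\tau$, we check that each of the three branches has the specified type $\mathcal{I}$.
In the base case, we may assume in addition to $\psi : \mathsf{ctx}$ that we have a variable
$p : \lceil \psi \vdash_v \mathsf{tm} \rceil$ and check that the body has the appropriate type. If we encounter
a contextual object built with the domain-level constant app, then we choose
the branch $b_{app}$. We assume $\psi : \mathsf{ctx}$, $m : \lceil \psi \vdash \mathsf{tm} \rceil$, $n : \lceil \psi \vdash \mathsf{tm} \rceil$, as well as $f_m$ and
$f_n$ which stand for the recursive calls on m and n respectively. We then check
that the body $t_{app}$ is well-typed. If we encounter a domain object built with the
domain-level constant lam, then we choose the branch $b_{lam}$. We assume $\psi : \mathsf{ctx}$
and $m : \lceil \psi, x{:}\mathsf{tm} \vdash \mathsf{tm} \rceil$ together with the recursive call $f_m$ on m in the extended
LF context $\psi, x{:}\mathsf{tm}$. We then check that the body $t_{lam}$ is well-typed. The typing
rules for computations are given in Fig. 3. We omit the reduction rules here and
refer the interested reader to the appendix.

$\Gamma; \Psi \vdash M : A$   Term M has type A in domain-level context $\Psi$ and context $\Gamma$

\[
\frac{\Gamma \vdash \Psi : \mathsf{ctx} \qquad x{:}A \in \Psi}{\Gamma; \Psi \vdash x : A}
\qquad
\frac{\Gamma \vdash \Psi : \mathsf{ctx}}{\Gamma; \Psi \vdash \mathsf{lam} : (\mathsf{tm} \to \mathsf{tm}) \to \mathsf{tm}}
\qquad
\frac{\Gamma \vdash \Psi : \mathsf{ctx}}{\Gamma; \Psi \vdash \mathsf{app} : \mathsf{tm} \to \mathsf{tm} \to \mathsf{tm}}
\]
\[
\frac{\Gamma; \Psi \vdash M : A \to B \qquad \Gamma; \Psi \vdash N : A}{\Gamma; \Psi \vdash M\ N : B}
\qquad
\frac{\Gamma; \Psi, x{:}A \vdash M : B}{\Gamma; \Psi \vdash \lambda x.M : A \to B}
\qquad
\frac{\Gamma \vdash t : \lceil \Phi \vdash A \rceil \qquad \Gamma; \Psi \vdash \sigma : \Phi}{\Gamma; \Psi \vdash \lfloor t \rfloor_\sigma : A}
\]

$\Gamma; \Phi \vdash \sigma : \Psi$   Substitution $\sigma$ provides a mapping from the (domain) context $\Psi$ to $\Phi$

\[
\frac{\Gamma \vdash \Psi, \overrightarrow{x{:}A} : \mathsf{ctx}}{\Gamma; \Psi, \overrightarrow{x{:}A} \vdash \mathsf{wk}_{\hat\Psi} : \Psi}
\qquad
\frac{\Gamma \vdash \Phi : \mathsf{ctx}}{\Gamma; \Phi \vdash \cdot : \cdot}
\qquad
\frac{\Gamma; \Phi \vdash \sigma : \Psi \qquad \Gamma; \Phi \vdash M : A}{\Gamma; \Phi \vdash \sigma, M : \Psi, x{:}A}
\]

$\Gamma \vdash \Psi : \mathsf{ctx}$   Domain-level context $\Psi$ is well-formed

\[
\frac{}{\Gamma \vdash \cdot : \mathsf{ctx}}
\qquad
\frac{\Gamma(\psi) = \mathsf{ctx}}{\Gamma \vdash \psi : \mathsf{ctx}}
\qquad
\frac{\Gamma \vdash \Psi : \mathsf{ctx}}{\Gamma \vdash \Psi, x{:}A : \mathsf{ctx}}
\]

Fig. 2. Typing Rules for Domain-level Terms, Substitutions, Contexts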


5.1 Interpretation 


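We now give an interpretation of simply-typed Cocon in a presheaf model with
a cartesian closed universe of representables. Let us first extend the internal
dependent type theory with the constant tm for modelling the domain-level
type constant tm and with the constants $\mathsf{app} : \mathsf{El}\ \mathsf{tm} \to \mathsf{El}\ \mathsf{tm} \to \mathsf{El}\ \mathsf{tm}$ and
$\mathsf{lam} : (\mathsf{El}\ \mathsf{tm} \to \mathsf{El}\ \mathsf{tm}) \to \mathsf{El}\ \mathsf{tm}$ to model the corresponding domain-level constants
app and lam.

$\Gamma \vdash C : T$   Contextual object C has contextual type T

\[
\frac{\Gamma; \Psi \vdash M : A}{\Gamma \vdash (\hat\Psi \vdash M) : (\Psi \vdash A)}
\qquad
\frac{\Gamma \vdash \Psi : \mathsf{ctx} \qquad x{:}A \in \Psi}{\Gamma \vdash (\hat\Psi \vdash x) : (\Psi \vdash_v A)}
\qquad
\frac{x : \lceil \Phi \vdash_v A \rceil \in \Gamma \qquad \Gamma; \Psi \vdash \mathsf{wk}_{\hat\Phi} : \Phi}{\Gamma \vdash (\hat\Psi \vdash \lfloor x \rfloor_{\mathsf{wk}_{\hat\Phi}}) : (\Psi \vdash_v A)}
\]

$\Gamma \vdash t : \tau$   Term t has computation type $\tau$

\[
\frac{y : \breve\tau \in \Gamma}{\Gamma \vdash y : \breve\tau}
\qquad
\frac{\Gamma \vdash C : T}{\Gamma \vdash \lceil C \rceil : \lceil T \rceil}
\]
\[
\frac{\Gamma \vdash t : (y : \breve\tau_1) \Rightarrow \tau_2 \qquad \Gamma \vdash s : \breve\tau_1}{\Gamma \vdash t\ s : [s/y]\tau_2}
\qquad
\frac{\Gamma, y : \breve\tau_1 \vdash t : \tau_2 \qquad \Gamma \vdash (y : \breve\tau_1) \Rightarrow \tau_2 : \mathsf{type}}{\Gamma \vdash \mathsf{fn}\ y \Rightarrow t : (y : \breve\tau_1) \Rightarrow \tau_2}
\]

Recursor over domain-level terms, $\mathcal{I} = (\psi : \mathsf{ctx}) \Rightarrow (y : \lceil \psi \vdash \mathsf{tm} \rceil) \Rightarrow \tau$

\[
\frac{\Gamma \vdash t : \lceil \Psi \vdash \mathsf{tm} \rceil \qquad \Gamma \vdash \mathcal{I} : \mathsf{type} \qquad \Gamma \vdash b_v : \mathcal{I} \qquad \Gamma \vdash b_{app} : \mathcal{I} \qquad \Gamma \vdash b_{lam} : \mathcal{I}}
     {\Gamma \vdash \mathsf{rec}^{\mathcal{I}}\ (b_v \mid b_{app} \mid b_{lam})\ \Psi\ t : [\Psi/\psi]\tau}
\]

Branch for Variable

\[
\frac{\Gamma, \psi : \mathsf{ctx}, p : \lceil \psi \vdash_v \mathsf{tm} \rceil \vdash t_v : \tau}
     {\Gamma \vdash (\psi, p \mapsto t_v) : \mathcal{I}}
\]

Branch for Application app

\[
\frac{\Gamma, \psi : \mathsf{ctx}, m : \lceil \psi \vdash \mathsf{tm} \rceil, n : \lceil \psi \vdash \mathsf{tm} \rceil, f_m : \tau, f_n : \tau \vdash t_{app} : \tau}
     {\Gamma \vdash (\psi, m, n, f_n, f_m \mapsto t_{app}) : \mathcal{I}}
\]

Branch for Function lam

\[
\frac{\Gamma, \phi : \mathsf{ctx}, m : \lceil \phi, x{:}\mathsf{tm} \vdash \mathsf{tm} \rceil, f_m : [(\phi, x{:}\mathsf{tm})/\psi]\tau \vdash t_{lam} : [\phi/\psi]\tau}
     {\Gamma \vdash (\psi, m, f_m \mapsto t_{lam}) : \mathcal{I}}
\]

Fig. 3. Typing Rules for Contextual Objects and Computations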

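We can now translate domain-level and computation-level types of Cocon into
the internal dependent type theory for $\hat{D}$. We do so by interpreting the domain-level
terms, types, substitutions, and contexts (see Fig. 4). All translations are on
well-typed terms and types. Domain-level types are interpreted as the terms of
type Obj in the internal dependent type theory that represent them. Domain-level
contexts are also interpreted as terms of type Obj by $\llbracket \Gamma \vdash \Psi : \mathsf{ctx} \rrbracket$. For example,
a domain-level context x:tm, y:tm is interpreted as times (times unit tm) tm:
Obj. A domain-level substitution with domain $\Psi$ and codomain $\Phi$ becomes
a term of type $\mathsf{El}\,e'$ that is parameterised by an element $u : \mathsf{El}\,e$, where
$e = \llbracket \Gamma \vdash \Psi : \mathsf{ctx} \rrbracket$ and $e' = \llbracket \Gamma \vdash \Phi : \mathsf{ctx} \rrbracket$. As $e'$ is some product, for example
times (times unit tm) tm, the domain-level substitution is translated into an
n-ary tuple. A weakening substitution $\Gamma; \Psi, x{:}\mathsf{tm} \vdash \mathsf{wk}_{\hat\Psi} : \Psi$ is interpreted as
fst u where $u : \mathsf{El}\,(\mathsf{times}\ e\ \mathsf{tm})$ and $e = \llbracket \Gamma \vdash \Psi : \mathsf{ctx} \rrbracket$. More generally, when we
weaken a context $\Psi$ by n declarations, i.e. $\overrightarrow{x{:}A}$, we interpret $\mathsf{wk}_{\hat\Psi}$ as $\mathsf{fst}^n\ u$.

A well-typed domain-level term, $\Gamma; \Psi \vdash M : A$, is mapped to an object of
type $\mathsf{El}\,\llbracket A \rrbracket$ that depends on $u : \mathsf{El}\,\llbracket \Gamma \vdash \Psi : \mathsf{ctx} \rrbracket$.

Hence the translation of a well-typed domain-level term is indexed by u that
stands for the term-level interpretation of a domain-level context $\Psi$. Initially, u
is simply a variable. However, when we translate $\Gamma; \Psi \vdash \lambda x.M : A \to B$ given
$u : \mathsf{El}\,e$ where $\llbracket \Gamma \vdash \Psi : \mathsf{ctx} \rrbracket = e$, we need to recursively translate M in the
extended domain-level context $\Psi, x{:}A$ and hence we also need to build a term
pair u x that inhabits $\mathsf{El}\,(\mathsf{times}\ e\ \llbracket A \rrbracket)$. The translation of $\Gamma; \Psi, x{:}A \vdash M : B$
will return a term e that may contain x. However, note that x will eventually
be bound in $\mathsf{arrow\text{-}i}\ (\lambda x{:}\mathsf{El}\,\llbracket A \rrbracket.\ e)$. When we translate a variable x where
$\Psi = \Psi_0, x{:}A, y_k{:}A_k, \ldots, y_1{:}A_1$, we return $\mathsf{snd}\ (\mathsf{fst}^k\ u)$. We translate $\Gamma; \Psi \vdash \lfloor t \rfloor_\sigma : A$
directly using the let box construct, where the domain-level substitution $\sigma$ is simply
translated into a pair. As the computation t has the contextual type $\lceil \Psi \vdash \mathsf{tm} \rceil$,
its translation will be of type $\flat(\mathsf{El}\,e \to \mathsf{El}\ \mathsf{tm})$ where $e = \llbracket \Gamma \vdash \Psi : \mathsf{ctx} \rrbracket$. Hence
we can simply extract a function $x : (\mathsf{El}\,e \to \mathsf{El}\ \mathsf{tm})$ using the let box construct and
pass to it the interpretation of $\sigma$. The translation of domain-level applications
and domain-level constants app and lam is straightforward.

Interpretation of domain-level types

\[
\begin{array}{lcl}
\llbracket \mathsf{tm} \rrbracket & = & \mathsf{tm} \\
\llbracket A \to B \rrbracket & = & \mathsf{arrow}\ \llbracket A \rrbracket\ \llbracket B \rrbracket
\end{array}
\]

Interpretation of domain-level contexts

\[
\begin{array}{lcll}
\llbracket \Gamma \vdash \psi : \mathsf{ctx} \rrbracket & = & \psi & \\
\llbracket \Gamma \vdash \cdot : \mathsf{ctx} \rrbracket & = & \mathsf{unit} & \\
\llbracket \Gamma \vdash (\Psi, x{:}A) : \mathsf{ctx} \rrbracket & = & \mathsf{times}\ e\ \llbracket A \rrbracket & \text{where } \llbracket \Gamma \vdash \Psi : \mathsf{ctx} \rrbracket = e
\end{array}
\]

Interpretation of domain-level terms where $u : \mathsf{El}\,e$ and $\llbracket \Gamma \vdash \Psi : \mathsf{ctx} \rrbracket = e$

\[
\begin{array}{lcll}
\llbracket \Gamma; \Psi \vdash x : A \rrbracket_u & = & \mathsf{snd}\ (\mathsf{fst}^k\ u) & \text{where } \Psi = \Psi_0, x{:}A, y_k{:}A_k, \ldots, y_1{:}A_1 \\
\llbracket \Gamma; \Psi \vdash \lambda x.M : A \to B \rrbracket_u & = & \mathsf{arrow\text{-}i}\ (\lambda x{:}\mathsf{El}\,\llbracket A \rrbracket.\ e) & \text{where } \llbracket \Gamma; \Psi, x{:}A \vdash M : B \rrbracket_{(\mathsf{pair}\ u\ x)} = e \\
\llbracket \Gamma; \Psi \vdash M\ N : B \rrbracket_u & = & \mathsf{arrow\text{-}e}\ e_1\ e_2 & \text{where } \llbracket \Gamma; \Psi \vdash M : A \to B \rrbracket_u = e_1 \\
 & & & \text{and } \llbracket \Gamma; \Psi \vdash N : A \rrbracket_u = e_2 \\
\llbracket \Gamma; \Psi \vdash \lfloor t \rfloor_\sigma : A \rrbracket_u & = & \mathsf{let\ box}\ x = e_1\ \mathsf{in}\ x\ e_2 & \text{where } \llbracket \Gamma \vdash t : \lceil \Phi \vdash A \rceil \rrbracket = e_1 \\
 & & & \text{and } \llbracket \Gamma; \Psi \vdash \sigma : \Phi \rrbracket_u = e_2 \\
\llbracket \Gamma; \Psi \vdash \mathsf{app} : \mathsf{tm} \to \mathsf{tm} \to \mathsf{tm} \rrbracket_u & = & \multicolumn{2}{l}{\mathsf{arrow\text{-}i}\ (\lambda x{:}\mathsf{El}\ \mathsf{tm}.\ \mathsf{arrow\text{-}i}\ (\lambda y{:}\mathsf{El}\ \mathsf{tm}.\ \mathsf{app}\ x\ y))} \\
\llbracket \Gamma; \Psi \vdash \mathsf{lam} : (\mathsf{tm} \to \mathsf{tm}) \to \mathsf{tm} \rrbracket_u & = & \multicolumn{2}{l}{\mathsf{arrow\text{-}i}\ (\lambda f{:}\mathsf{El}\,(\mathsf{arrow}\ \mathsf{tm}\ \mathsf{tm}).\ \mathsf{lam}\ (\lambda x{:}\mathsf{El}\ \mathsf{tm}.\ \mathsf{arrow\text{-}e}\ f\ x))}
\end{array}
\]

Interpretation of domain-level substitutions where $u : \mathsf{El}\,e$ and $\llbracket \Gamma \vdash \Psi : \mathsf{ctx} \rrbracket = e$

\[
\begin{array}{lcll}
\llbracket \Gamma; \Psi \vdash \cdot : \cdot \rrbracket_u & = & \mathsf{terminal} & \\
\llbracket \Gamma; \Psi \vdash (\sigma, M) : \Phi, x{:}A \rrbracket_u & = & \mathsf{pair}\ e_1\ e_2 & \text{where } \llbracket \Gamma; \Psi \vdash \sigma : \Phi \rrbracket_u = e_1 \\
 & & & \text{and } \llbracket \Gamma; \Psi \vdash M : A \rrbracket_u = e_2 \\
\llbracket \Gamma; \Psi, \overrightarrow{x{:}A} \vdash \mathsf{wk}_{\hat\Psi} : \Psi \rrbracket_u & = & \mathsf{fst}^n\ u & \text{where } n = |\overrightarrow{x{:}A}|
\end{array}
\]

Fig. 4. Interpretation of Domain-level Types and Terms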
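The environment-indexed translation of variables, abstractions and applications can be sketched as a small interpreter. The following Python code is a loose analogy rather than the interpretation itself: the environment u is a nested tuple, elements of $\mathsf{El}\,(\mathsf{arrow}\ A\ B)$ are modelled directly as Python functions (so arrow-i and arrow-e are left implicit), and unboxing and the constants app and lam are omitted. The term encoding and the names `fst_k` and `interp` are hypothetical.

```python
def fst_k(k, u):
    """Apply the first projection k times to the nested tuple u."""
    for _ in range(k):
        u = u[0]
    return u

def interp(term, ctx):
    """Translate a domain-level term into a function of the environment u.

    `ctx` lists the variable names of the domain-level context, the most
    recently declared variable last. Terms are tuples: ("var", x),
    ("lam", x, body) for an abstraction, and ("app", fun, arg).
    """
    tag = term[0]
    if tag == "var":
        # k is the number of declarations to the right of the variable.
        k = ctx[::-1].index(term[1])
        return lambda u: fst_k(k, u)[1]          # snd (fst^k u)
    if tag == "lam":
        body = interp(term[2], ctx + [term[1]])
        # The body is translated in the environment (pair u x).
        return lambda u: (lambda x: body((u, x)))
    if tag == "app":
        fun = interp(term[1], ctx)
        arg = interp(term[2], ctx)
        return lambda u: fun(u)(arg(u))
    raise ValueError(f"unknown term constructor {tag!r}")

def weaken(n):
    """Interpretation of a weakening by n declarations: fst^n u."""
    return lambda u: fst_k(n, u)

# The context x, y is modelled by the environment (((), X), Y).
env = (((), "X"), "Y")
const_x_applied = ("app", ("lam", "z", ("var", "x")), ("var", "y"))
```

With these definitions, `interp(("var", "x"), ["x", "y"])(env)` selects `"X"` via snd (fst u), and `interp(const_x_applied, ["x", "y"])(env)` also evaluates to `"X"`.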

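The interpretation of a contextual type $\lceil \Psi \vdash A \rceil$ makes explicit the fact that
it corresponds to functions $\mathsf{El}\,e \to \mathsf{El}\,\llbracket A \rrbracket$ where $e = \llbracket \Gamma \vdash \Psi : \mathsf{ctx} \rrbracket$ (see Fig. 5).
Consequently, the corresponding contextual object $(\hat\Psi \vdash M)$ is interpreted as a
function. Similarly, $\lceil \Psi \vdash_v A \rceil$ is mapped to the restricted function space denoted
by $\to_v$, which describes functions with bodies that only contain projections.

Interpretation of contextual objects (C)

\[
\begin{array}{lcll}
\llbracket \Gamma \vdash (\hat\Phi \vdash M) : (\Phi \vdash A) \rrbracket & = & \lambda u{:}\mathsf{El}\,e.\ e' & \text{where } \llbracket \Gamma \vdash \Phi : \mathsf{ctx} \rrbracket = e \text{ and } \llbracket \Gamma; \Phi \vdash M : A \rrbracket_u = e' \\
\llbracket \Gamma \vdash (\hat\Phi \vdash M) : (\Phi \vdash_v A) \rrbracket & = & \lambda u{:}\mathsf{El}\,e.\ e' & \text{where } \llbracket \Gamma \vdash \Phi : \mathsf{ctx} \rrbracket = e \text{ and } \llbracket \Gamma; \Phi \vdash M : A \rrbracket_u = e'
\end{array}
\]

Interpretation of contextual types (T)

\[
\begin{array}{lcll}
\llbracket \Gamma \vdash (\Phi \vdash A) \rrbracket & = & (u : \mathsf{El}\,e) \to \mathsf{El}\,\llbracket A \rrbracket & \text{where } \llbracket \Gamma \vdash \Phi : \mathsf{ctx} \rrbracket = e \\
\llbracket \Gamma \vdash (\Phi \vdash_v A) \rrbracket & = & (u : \mathsf{El}\,e) \to_v \mathsf{El}\,\llbracket A \rrbracket & \text{where } \llbracket \Gamma \vdash \Phi : \mathsf{ctx} \rrbracket = e
\end{array}
\]

Fig. 5. Interpretation of Contextual Objects and Types


Last, we give the interpretation of computation-level types, contexts and
terms (see Fig. 6). It is mostly straightforward, as we simply map $\lceil T \rceil$ to $\flat\llbracket T \rrbracket$
and $\lceil C \rceil$ is simply interpreted as a boxed term.

Interpretation of computation-level types ($\tau$)

\[
\begin{array}{lcl}
\llbracket \lceil T \rceil \rrbracket & = & \flat \llbracket T \rrbracket \\
\llbracket (x : \breve\tau_1) \Rightarrow \tau_2 \rrbracket & = & (x : \llbracket \breve\tau_1 \rrbracket) \to \llbracket \tau_2 \rrbracket \\
\llbracket \mathsf{ctx} \rrbracket & = & \mathsf{Obj}
\end{array}
\]

Computation-level typing contexts ($\Gamma$)

\[
\llbracket \Gamma, x : \breve\tau \rrbracket = \llbracket \Gamma \rrbracket, x : \llbracket \breve\tau \rrbracket
\]

Interpretation of computations ($\Gamma \vdash t : \tau$; without recursor)

\[
\begin{array}{lcll}
\llbracket \Gamma \vdash \lceil C \rceil : \lceil T \rceil \rrbracket & = & \mathsf{box}\ e & \text{where } \llbracket \Gamma \vdash C : T \rrbracket = e \\
\llbracket \Gamma \vdash t_1\ t_2 : \tau \rrbracket & = & e_1\ e_2 & \text{where } \llbracket \Gamma \vdash t_1 : (x : \breve\tau_2) \Rightarrow \tau \rrbracket = e_1 \\
 & & & \text{and } \llbracket \Gamma \vdash t_2 : \breve\tau_2 \rrbracket = e_2 \\
\llbracket \Gamma \vdash \mathsf{fn}\ x \Rightarrow t : (x : \breve\tau_1) \Rightarrow \tau_2 \rrbracket & = & \lambda x{:}\llbracket \breve\tau_1 \rrbracket.\ e & \text{where } \llbracket \Gamma, x : \breve\tau_1 \vdash t : \tau_2 \rrbracket = e \\
\llbracket \Gamma \vdash x : \tau \rrbracket & = & x &
\end{array}
\]

Fig. 6. Interpretation of Computation-level Types and Terms (without recursor)


The interpretation of the recursor is also straightforward now (see Fig. 7). In
Lemma 4, we expressed a primitive recursion scheme in our internal type theory
and defined a term rec together with its type. We now interpret every branch of
our recursor in the computation-level as a function of the required type in our
internal type theory. While this is somewhat tedious, it is straightforward.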

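We can now show that all well-typed domain-level and computation-level
objects are translated into well-typed constructions in our internal type theory.
As a consequence, we can show that equality in Cocon is equivalent to the
corresponding equivalence in our internal type theoretic interpretation.

Interpretation of recursor for $\mathcal{I} = (\psi : \mathsf{ctx}) \Rightarrow (y : \lceil \psi \vdash \mathsf{tm} \rceil) \Rightarrow \tau$:

\[
\llbracket \Gamma \vdash \mathsf{rec}^{\mathcal{I}}\ (b_v \mid b_{app} \mid b_{lam})\ \Psi\ t : [\Psi/\psi, t/y]\tau \rrbracket = \mathsf{rec}\ e_v\ e_{app}\ e_{lam}\ e_c\ e
\]

where $\llbracket \Gamma \vdash b_v : \mathcal{I} \rrbracket = e_v$, $\llbracket \Gamma \vdash b_{app} : \mathcal{I} \rrbracket = e_{app}$, $\llbracket \Gamma \vdash b_{lam} : \mathcal{I} \rrbracket = e_{lam}$,
$\llbracket \Gamma \vdash \Psi : \mathsf{ctx} \rrbracket = e_c$ and $\llbracket \Gamma \vdash t : \lceil \Psi \vdash \mathsf{tm} \rceil \rrbracket = e$

Interpretation of Variable Branch

\[
\llbracket \Gamma \vdash (\psi, p \mapsto t_v) : \mathcal{I} \rrbracket = \lambda \psi{:}\mathsf{Obj}.\ \lambda p{:}\flat(\mathsf{El}\,\psi \to_v \mathsf{El}\ \mathsf{tm}).\ e
\]

where $\llbracket \Gamma, \psi : \mathsf{ctx}, p : \lceil \psi \vdash_v \mathsf{tm} \rceil \vdash t_v : [p/y]\tau \rrbracket = e$

Interpretation of Application Branch

\[
\llbracket \Gamma \vdash (\psi, m, n, f_n, f_m \mapsto t_{app}) : \mathcal{I} \rrbracket = \lambda \psi{:}\mathsf{Obj}.\ \lambda m, n{:}\flat(\mathsf{El}\,\psi \to \mathsf{El}\ \mathsf{tm}).\ \lambda f_m{:}\llbracket [m/y]\tau \rrbracket.\ \lambda f_n{:}\llbracket [n/y]\tau \rrbracket.\ e
\]

where $\llbracket \Gamma, \psi : \mathsf{ctx}, m : \lceil \psi \vdash \mathsf{tm} \rceil, n : \lceil \psi \vdash \mathsf{tm} \rceil \vdash t_{app} : [\lceil \psi \vdash \mathsf{app}\ \lfloor m \rfloor\ \lfloor n \rfloor \rceil / y]\tau \rrbracket = e$

Interpretation of Lambda-Abstraction Branch

\[
\llbracket \Gamma \vdash (\psi, m, f_m \mapsto t_{lam}) : \mathcal{I} \rrbracket = \lambda \psi{:}\mathsf{Obj}.\ \lambda m{:}\flat(\mathsf{El}\,(\mathsf{times}\ \psi\ \mathsf{tm}) \to \mathsf{El}\ \mathsf{tm}).\ \lambda f_m{:}\tau_m.\ e
\]

where $\llbracket [(\psi, x{:}\mathsf{tm})/\psi, m/y]\tau \rrbracket = \tau_m$ and
$\llbracket \Gamma, \psi : \mathsf{ctx}, m : \lceil \psi, x{:}\mathsf{tm} \vdash \mathsf{tm} \rceil \vdash t_{lam} : [\lceil \psi \vdash \mathsf{lam}\ \lambda x.\lfloor m \rfloor \rceil / y]\tau \rrbracket = e$

Fig. 7. Interpretation of Recursor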


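Lemma 5. The interpretation maintains the following typing invariants:

- If $\Gamma \vdash \Psi : \mathsf{ctx}$ then $\llbracket \Gamma \rrbracket \vdash \llbracket \Gamma \vdash \Psi : \mathsf{ctx} \rrbracket : \mathsf{Obj}$.
- If $\Gamma; \Psi \vdash M : A$ then $\llbracket \Gamma \rrbracket, u : \mathsf{El}\,\llbracket \Gamma \vdash \Psi : \mathsf{ctx} \rrbracket \vdash \llbracket \Gamma; \Psi \vdash M : A \rrbracket_u : \mathsf{El}\,\llbracket A \rrbracket$.
- If $\Gamma; \Psi \vdash \sigma : \Phi$ then $\llbracket \Gamma \rrbracket, u : \mathsf{El}\,\llbracket \Gamma \vdash \Psi : \mathsf{ctx} \rrbracket \vdash \llbracket \Gamma; \Psi \vdash \sigma : \Phi \rrbracket_u : \mathsf{El}\,\llbracket \Phi \rrbracket$.
- If $\Gamma \vdash C : T$ then $\llbracket \Gamma \rrbracket \vdash \llbracket \Gamma \vdash C : T \rrbracket : \llbracket T \rrbracket$.
- If $\Gamma \vdash t : \tau$ then $\llbracket \Gamma \rrbracket \vdash \llbracket \Gamma \vdash t : \tau \rrbracket : \llbracket \tau \rrbracket$.


The proof goes by induction on derivations.


Proposition 1 (Soundness). The following are true.

- If $\Gamma; \Psi \vdash M \equiv N : A$ then
  $\llbracket \Gamma \rrbracket, u : \mathsf{El}\,\llbracket \Psi \rrbracket \vdash \llbracket \Gamma; \Psi \vdash M : A \rrbracket_u = \llbracket \Gamma; \Psi \vdash N : A \rrbracket_u : \mathsf{El}\,\llbracket A \rrbracket$.
- If $\Gamma; \Psi \vdash \sigma \equiv \sigma' : \Phi$ then
  $\llbracket \Gamma \rrbracket, u : \mathsf{El}\,\llbracket \Psi \rrbracket \vdash \llbracket \Gamma; \Psi \vdash \sigma : \Phi \rrbracket_u = \llbracket \Gamma; \Psi \vdash \sigma' : \Phi \rrbracket_u : \mathsf{El}\,\llbracket \Phi \rrbracket$.
- If $\Gamma \vdash t_1 \equiv t_2 : \tau$ then $\llbracket \Gamma \rrbracket \vdash \llbracket \Gamma \vdash t_1 : \tau \rrbracket = \llbracket \Gamma \vdash t_2 : \tau \rrbracket : \llbracket \tau \rrbracket$.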


6 Presheaves on a Small Category with Attributes 


To explain the core of our approach as simply as possible, we have concentrated 
on a simply-typed domain language. In the remaining space, we outline how our 
approach generalises to dependent domain languages like LF. 

We follow the same approach as above. We start from a term model D of the
domain language and then interpret contextual types in the presheaf category $\hat{D}$.
In the simply-typed case above, D was a small cartesian closed category. In the
dependent case, D is a small Category with Attributes. Categories with attributes
(CwAs) [11] are a general notion of model for dependent type theories that is
suitable for modelling dependent domain languages like LF.

With this change, we follow essentially the same approach as above. The 
main difference is that the universe of representables now makes available the 
CwA-structure of D instead of the cartesian closed structure. The following 
section outlines this in analogy to Sec. 4.1. 


6.1 Yoneda CwA 


In a Yoneda CwA we again have a type for the objects of D, which we now denote
Ctx. In the term model for LF, these would be the LF contexts. The type Ty c
represents (possibly dependent) LF types in context c. Contexts can be built
with the constants nil and cons.

\[
\begin{array}{ll}
\vdash \mathsf{Ctx}\ \mathsf{type} & \vdash \mathsf{nil} : \mathsf{Ctx} \\
c : \mathsf{Ctx} \vdash \mathsf{Ty}\ c\ \mathsf{type} & \vdash \mathsf{cons} : (c : \mathsf{Ctx}) \to (a : \mathsf{Ty}\ c) \to \mathsf{Ctx}
\end{array}
\]

Both Ctx and Ty c are constant presheaves, i.e. $\flat\mathsf{Ctx} = \mathsf{Ctx}$ and $\flat(\mathsf{Ty}\ c) = \mathsf{Ty}\ c$.
As in Sec. 4.1, we consider the contexts as codes of a universe.

\[
c : \mathsf{Ctx} \vdash \mathsf{El}\ c\ \mathsf{type}
\]

The type El c has the same interpretation as before and is essentially just the
Yoneda embedding. The morphisms $c \to d$ of the CwA D thus appear as functions
of type $\mathsf{El}\,c \to \mathsf{El}\,d$.

The axioms of a CwA can be stated using terms and equations in the internal
language of $\hat{D}$. For example, substitution on types and context projection
morphisms are given by the following constants.

\[
\begin{array}{l}
c, d : \mathsf{Ctx} \vdash \mathsf{sub} : (a : \mathsf{Ty}\ d) \to (f : \mathsf{El}\,c \to \mathsf{El}\,d) \to \mathsf{Ty}\ c \\
c : \mathsf{Ctx},\ a : \mathsf{Ty}\ c \vdash \mathsf{p} : \mathsf{El}\,(\mathsf{cons}\ c\ a) \to \mathsf{El}\,c
\end{array}
\]

The other components of a CwA are added similarly and the CwA-axioms [11]
are expressed in terms of equations for these constants.
The inhabitants of a type can then be captured by the dependent type

\[
c : \mathsf{Ctx},\ a : \mathsf{Ty}\ c,\ u : \mathsf{El}\,c \vdash \mathsf{I}\ a\ u\ \mathsf{type}
\]

defined by $\mathsf{I}\ a\ u := \Sigma v{:}\mathsf{El}\,(\mathsf{cons}\ c\ a).\ (\mathsf{p}\ v) = u$. This type contains all values in
El (cons c a) whose first projection is u. If one considers u: El c as a dependent
tuple of LF terms (one term for each variable in the context represented by c),
then I a u represents all the terms that can be appended to this tuple to make
it into one of type El (cons c a). Indeed, one can define a pairing operation by
$\mathsf{pair} := \lambda u.\ \lambda (v, p).\ v$.

\[
c : \mathsf{Ctx},\ a : (\mathsf{Ty}\ c) \vdash \mathsf{pair} : (u : \mathsf{El}\,c) \to \mathsf{I}\ a\ u \to \mathsf{El}\,(\mathsf{cons}\ c\ a)
\]

With these definitions, we can represent dependent contextual types much like
the simply-typed ones. Recall that we had interpreted $\Phi \vdash A$ by $\mathsf{El}\,\llbracket \Phi \rrbracket \to \mathsf{El}\,\llbracket A \rrbracket$
where both $\llbracket \Phi \rrbracket$ and $\llbracket A \rrbracket$ were terms of type Obj. In the dependent case, A may
depend on $\Phi$. The interpretation of $\Phi$ is a term $\llbracket \Phi \rrbracket : \mathsf{Ctx}$, much as before. The
interpretation of A takes the dependency into account: $u : \mathsf{El}\,\llbracket \Phi \rrbracket \vdash \llbracket A \rrbracket_u : \mathsf{Ty}\ u$.
The interpretation of the contextual type $\Phi \vdash A$ will then be:

\[
(u : \mathsf{El}\,\llbracket \Phi \rrbracket) \to \mathsf{I}\ \llbracket A \rrbracket_u\ u
\]

It may be interesting to note that $(u : \mathsf{El}\,c) \to \mathsf{I}\ a\ u$ is isomorphic to the type of
sections of $\mathsf{p} : \mathsf{El}\,(\mathsf{cons}\ c\ a) \to \mathsf{El}\,c$.

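Object-level term constants in LF can be lifted using I. Consider, for example,
an encoding of the simply-typed lambda-calculus in LF. It represents only well-typed
terms by means of the constants $\mathsf{app} : \Pi a, b{:}\mathsf{ty}.\ \mathsf{tm}\ (\mathsf{arr}\ a\ b) \to \mathsf{tm}\ a \to \mathsf{tm}\ b$
and $\mathsf{lam} : \Pi a, b{:}\mathsf{ty}.\ (\mathsf{tm}\ a \to \mathsf{tm}\ b) \to \mathsf{tm}\ (\mathsf{arr}\ a\ b)$. Therein, the type tm of object-level
terms is dependent on an object-level type ty, which may be built using a
constant o: ty for a base type and a constant $\mathsf{arr} : \mathsf{ty} \to \mathsf{ty} \to \mathsf{ty}$ for function
types. This encoding lifts to the Yoneda CwA as in the simply-typed case:

\[
\begin{array}{ll}
c : \mathsf{Ctx} \vdash \mathsf{ty} : \mathsf{Ty}\ c & \Gamma \vdash \mathsf{o} : \mathsf{I}\ \mathsf{ty}\ u \\
c : \mathsf{Ctx} \vdash \mathsf{tm} : \mathsf{Ty}\ (\mathsf{cons}\ c\ \mathsf{ty}) & \Gamma \vdash \mathsf{arr} : \mathsf{I}\ \mathsf{ty}\ u \to \mathsf{I}\ \mathsf{ty}\ u \to \mathsf{I}\ \mathsf{ty}\ u \\
\multicolumn{2}{l}{\Delta \vdash \mathsf{app} : \mathsf{I}\ \mathsf{tm}\ (\mathsf{pair}\ u\ (\mathsf{arr}\ a\ b)) \to \mathsf{I}\ \mathsf{tm}\ (\mathsf{pair}\ u\ a) \to \mathsf{I}\ \mathsf{tm}\ (\mathsf{pair}\ u\ b)} \\
\multicolumn{2}{l}{\Delta \vdash \mathsf{lam} : (\mathsf{I}\ \mathsf{tm}\ (\mathsf{pair}\ u\ a) \to \mathsf{I}\ \mathsf{tm}\ (\mathsf{pair}\ u\ b)) \to \mathsf{I}\ \mathsf{tm}\ (\mathsf{pair}\ u\ (\mathsf{arr}\ a\ b))}
\end{array}
\]

Here, $\Gamma$ abbreviates $c : \mathsf{Ctx},\ u : (\mathsf{El}\,c)$ and $\Delta$ abbreviates $\Gamma,\ a, b : (\mathsf{I}\ \mathsf{ty}\ u)$. Notice
how lam uses higher-order abstract syntax at the meta level.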

With these definitions, the interpretation of Cocon is essentially just as before. 
For working with the dependencies in a Yoneda CwA, we found it very useful to 
type-check our definitions in Agda, see our sources [1]. 


7 Conclusion 


We have given a rational reconstruction of contextual type theory in presheaf
models of higher-order abstract syntax. This provides a semantical way of
understanding the invariants of contextual types independently of the algorithmic
details of type checking. At the same time, we identify the contextual modal
type theory, Cocon, which is known to be normalising, as a syntax for presheaf
models of HOAS. By accounting for the Yoneda embedding with a universe à
la Tarski, we obtain a manageable way of constructing contextual types in the
model, especially in the dependent case. While various forms of universes are
being studied in the context of functor categories, e.g. [2,16], we are not aware of
previous uses of presheaves over CwAs or similar.

In future work, one may consider using the model as a way of compiling 
contextual types, by implementing the semantics. In another direction, it may be 
interesting to apply the syntax of contextual types to other presheaf categories. 
We also hope that the model will help to guide the further development of Cocon. 


Acknowledgements. We thank the anonymous reviewers for helpful feedback. 
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Abstract. Probabilistic Büchi automata are a natural generalization 
of PFA to infinite words, but have been studied in-depth only rather 
recently and many interesting questions are still open. PBA are known 
to accept, in general, a class of languages that goes beyond the regular 
languages. In this work we extend the known classes of restricted PBA 
which are still regular, strongly relying on notions concerning ambiguity 
in classical w-automata. Furthermore, we investigate the expressivity of 
the not yet considered but natural class of weak PBA, and we also show 
that the regularity problem for weak PBA is undecidable. 


Keywords: probabilistic · Büchi · automata · ambiguity · weak


1 Introduction 


Probabilistic finite automata (PFA) are defined similarly to nondeterministic 
finite automata (NFA) with the difference that each transition is equipped with 
a probability (a value between 0 and 1), such that for each pair of state and 
letter, the probabilities of the corresponding outgoing transitions sum up to 1. 
PFA were already investigated in the 1960s in the seminal paper of Rabin
[18]. But while the development of the theory of automata on infinite words also
started around the same time [7], the model of probabilistic automata on infinite
words was first studied systematically in [3]. The central model in this
theory is that of probabilistic Büchi automata (PBA), which are syntactically
the same as PFA. The acceptance condition for runs is defined as for standard
nondeterministic Büchi automata (NBA): a run on an infinite word is accepting
if it visits an accepting state infinitely often (see [23,24] for an introduction to
the theory of automata on infinite words). In general, for probabilistic automata
one distinguishes different criteria for when a word is accepted. In the positive
semantics, it is required that the probability of the set of accepting runs is greater
than 0, in the almost-sure semantics it has to be 1, and in the threshold semantics
it has to be greater than a given value $\lambda$ between 0 and 1. It is easy to see that
PFA with positive or almost-sure semantics can only accept regular languages, 
because these conditions correspond to the fact that there is an accepting run or 
that all runs are accepting. For infinite words the situation is different, because
single runs on infinite words can have probability 0. Therefore, the existence of 
an accepting run is not the same as the set of accepting runs having probability 
greater than 0 (similarly, almost-sure semantics is not equivalent to all runs 
being accepting). And in fact, it turns out that PBA with positive (or almost- 
sure) semantics can accept non-regular languages [3]. This naturally raises the 
question under which conditions a PBA accepts a regular language. 


In [3] a subclass of PBA that accept only regular languages (under positive 
semantics) is introduced, called uniform PBA. The definition uses a semantic 
condition on the acceptance probabilities in end components of the PBA. A 
syntactic class of PBA that accepts only regular languages (under positive and 
almost-sure semantics) are the hierarchical PBA (HPBA) introduced in [8]. The 
state space of HPBA is partitioned into a sequence of layers such that for each 
pair of state and letter there is at most one transition that does not increase the 
layer. Decidability and expressiveness questions for HPBA have been studied in 
more detail in [11,10]. While HPBA accept only regular languages for positive 
and almost-sure semantics, it is not very hard to come up with HPBA that 
accept non-regular languages under the threshold semantics [8,11] (see also the 
example in Figure 2(a) on page 10). Restricting HPBA further such that there 
are only two layers and all accepting states are on the first layer leads to a class 
of PBA (called simple PBA, SPBA) that accept only regular languages even 
under threshold semantics [9]. 


In this paper, we are also interested in the question under which condi- 
tions PBA accept only regular languages. We identify syntactical patterns in 
the transition structure of PBA whose absence guarantees regularity of the ac- 
cepted language. These patterns have been used before for the classification of 
the degree of ambiguity of NFA and NBA [25,19,16]. The degree of ambiguity of 
a nondeterministic automaton corresponds to the maximal number of accepting 
runs that a single input word can have. For NBA, the ambiguity can (roughly) 
be uncountable, countable, or finite. For positive semantics, we show that PBA 
whose transition structure corresponds to at most countably ambiguous NBA, 
accept only regular languages. For almost-sure semantics, we need a slightly 
stronger condition for ensuring regularity. But both classes that we identify are 
easily seen to strictly subsume the class of HPBA. For the emptiness and uni- 
versality problems for these classes we obtain the same complexities as the ones 
for HPBA. In the case of threshold semantics, we show that finite ambiguity 
is a sufficient condition for regularity of the accepted language, generalizing a 
corresponding result for PFA from [12]. The class of finitely ambiguous PBA 
strictly subsumes the class of SPBA. 


Besides the relation between regularity and ambiguity in PBA, we also
investigate the class of weak PBA (abbreviated PWA). In weak Büchi automata, the
set of accepting states is a union of strongly connected components of the
automaton. We show that PWA with almost-sure semantics define the same class
of languages as PBA with almost-sure semantics (which implies that with
positive semantics PWA define the same class as probabilistic co-Büchi automata).


524 C. Löding and A. Pirogov


This is in correspondence to results for non-probabilistic automata: weak
automata with universal semantics (a word is accepted if all runs are accepting)
define the same class as Büchi automata with universal semantics, and
nondeterministic weak automata correspond to nondeterministic co-Büchi automata
(see, e.g., [17], where weak automata are called weak parity automata).
Furthermore, it is known that universal Büchi automata, respectively nondeterministic
co-Büchi automata, can be transformed into equivalent deterministic automata
(with the same acceptance condition). An analogue of deterministic automata
in the probabilistic setting are the so-called 0/1 automata, in which each word
is either accepted with probability 0 or with probability 1. It is known that
almost-sure PBA can be transformed into equivalent 0/1 PBA (see the proof of
Theorem 4.13 in [4]). Concerning weak automata, a language can be accepted
by a deterministic weak automaton (DWA) if, and only if, it can be accepted by
a deterministic Büchi and by a deterministic co-Büchi automaton (this follows
from results in [14], see [6] for a more direct construction). We show an analogous 
result in the probabilistic setting: The class of languages defined by 0/1 PWA 
corresponds to the intersection of the two classes defined by PWA with almost- 
sure semantics and with positive semantics, respectively. It turns out that this 
class contains only regular languages, that is, 0/1 PWA define the same class as 
DWA. 

We also show that the regularity problem for PBA is undecidable (the prob- 
lem of deciding for a given PBA whether its language is regular). For PBA 
with positive semantics this is not surprising, as for those already the emptiness 
problem is undecidable [4]. However, for PBA with almost-sure semantics the 
emptiness and universality problems are decidable [1,2,8]. We show that regular- 
ity is undecidable already for PWA with almost-sure or with positive semantics. 
The proof also yields that it is undecidable for a fixed regular language whether 
a given PWA accepts this language. 

This work is organized as follows. After introducing basic notations in Sec- 
tion 2 we first characterize various regular subclasses of PBA that we derive 
from ambiguity patterns in Section 3 and then we derive some related complex- 
ity results in Section 4. In Section 5 we present our results concerning weak 
probabilistic automata and in Section 6 we conclude. 


2 Preliminaries 


First we briefly review some basic definitions. 

If $\Sigma$ is a finite alphabet, then $\Sigma^*$ is the set of all finite and $\Sigma^\omega$ is the set of
all infinite words $w = w_0 w_1 \ldots$ with $w_i \in \Sigma$. For a word $w$ we denote by $w(i)$
the $i$-th symbol $w_i$.

Classical automata used in this work usually have the shape $(Q, \Sigma, \Delta, Q_0, F)$,
where $Q$ is a finite set of states, $\Sigma$ a finite alphabet, $\Delta \subseteq Q \times \Sigma \times Q$ is the
transition relation and $Q_0, F \subseteq Q$ are the sets of initial and final states, respectively.

We write $\Delta(p,a) := \{q \in Q \mid (p,a,q) \in \Delta\}$ to denote the set of successors
of $p \in Q$ on symbol $a \in \Sigma$, and $\Delta(P,w)$ for $P \subseteq Q$, $w \in \Sigma^*$ with the usual
meaning, i.e., the states reachable on word $w$ from any state in $P$.
A run of an automaton on a word $w \in \Sigma^\omega$ is an infinite sequence of states
$q_0, q_1, \ldots$ starting in some $q_0 \in Q_0$ such that $(q_i, w(i), q_{i+1}) \in \Delta$ for all $i \ge 0$.
We say that a set of runs is separated (at time $i$) when the prefixes of length $i$
of those runs are pairwise different.

As usual, an automaton is deterministic if $|Q_0| = 1$ and $|\Delta(p,a)| \le 1$ for all
$p \in Q$, $a \in \Sigma$, and nondeterministic otherwise. For deterministic automata we
may use a transition function $\delta : Q \times \Sigma \to Q$ instead of a relation.

Probabilistic automata we consider have the shape $(Q, \Sigma, \delta, \mu_0, F)$, i.e., the
transition relation is replaced by a function $\delta : Q \times \Sigma \times Q \to [0,1]$ which for
each state and symbol assigns a probability distribution on successor states (i.e.,
$\sum_{q \in Q} \delta(p,a,q) = 1$ for all $p \in Q$, $a \in \Sigma$), and $\mu_0 : Q \to [0,1]$ with
$\sum_{q \in Q} \mu_0(q) = 1$ is the initial probability distribution on states. The support of a
distribution $\mu$ is the set $\mathrm{supp}(\mu) := \{x \mid \mu(x) > 0\}$. Similarly as above, we may write
$\delta(\mu, w)$ and mean the resulting probability distribution after reading $w \in \Sigma^*$, when
starting with probability distribution $\mu$.
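
As an illustration of these definitions, the following minimal Python sketch stores $\delta$ as a dictionary of distributions and computes $\delta(\mu, w)$ for a finite word. The encoding, the function names (`is_distribution`, `step`, `run_distribution`, `supp`) and the two-state example are assumptions made for illustration only and are not taken from the paper.

```python
from fractions import Fraction

# Hypothetical encoding: delta maps (state, symbol) to {successor: probability};
# a distribution on states is a dict {state: probability}.

def is_distribution(dist):
    """True if all values lie in [0, 1] and they sum to exactly 1."""
    return all(0 <= p <= 1 for p in dist.values()) and sum(dist.values()) == 1

def step(delta, mu, symbol):
    """Apply delta once to the distribution mu on a single symbol."""
    result = {}
    for state, mass in mu.items():
        for succ, prob in delta[(state, symbol)].items():
            result[succ] = result.get(succ, Fraction(0)) + mass * prob
    return result

def run_distribution(delta, mu, word):
    """delta(mu, w): distribution on states after reading the finite word w."""
    for symbol in word:
        mu = step(delta, mu, symbol)
    return mu

def supp(dist):
    """Support of a distribution: all elements with positive probability."""
    return {x for x, p in dist.items() if p > 0}

half = Fraction(1, 2)
delta = {
    ("p", "a"): {"p": half, "q": half},
    ("p", "b"): {"p": Fraction(1)},
    ("q", "a"): {"q": Fraction(1)},
    ("q", "b"): {"p": Fraction(1)},
}
mu0 = {"p": Fraction(1)}

print(run_distribution(delta, mu0, "aa"))
```

Exact rationals (`Fraction`) are used so that the check "sums to 1" is not affected by floating-point rounding.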

For a probabilistic automaton $\mathcal{A}$ the underlying automaton $\mathcal{A}^{\triangledown}$ is given by
recovering the transition relation $\Delta := \{(p,x,q) \mid \delta(p,x,q) > 0\}$ of positively
reachable states and the initial state set $Q_0 := \mathrm{supp}(\mu_0)$.
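
A minimal sketch of this step, assuming the same hypothetical dictionary encoding of the transition function (state and symbol mapped to a distribution on successors); the name `underlying` and the example values are illustrative. The sketch does not perform the trimming discussed below.

```python
from fractions import Fraction

def underlying(delta, mu0):
    """Return (Delta, Q0): positive-probability transitions and supp(mu0)."""
    Delta = {(p, a, q)
             for (p, a), dist in delta.items()
             for q, prob in dist.items() if prob > 0}
    Q0 = {q for q, prob in mu0.items() if prob > 0}
    return Delta, Q0

# Hypothetical example; entries with probability 0 are dropped.
delta = {
    ("p", "a"): {"p": Fraction(1, 3), "q": Fraction(2, 3)},
    ("q", "a"): {"q": Fraction(1), "p": Fraction(0)},
}
mu0 = {"p": Fraction(1), "q": Fraction(0)}
Delta, Q0 = underlying(delta, mu0)
print(sorted(Delta), Q0)
```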

As usual, a run of an automaton for finite words is accepting if it ends in
a final state. For automata on infinite words, run acceptance is determined by
the Büchi (the run visits final states infinitely often) or Co-Büchi (the run visits
final states only finitely often) conditions.


We write $p \xrightarrow{x} q$ if there exists a path from $p$ to $q$ labelled by $x \in \Sigma^+$ and
$p \to q$ if there exists some $x$ such that $p \xrightarrow{x} q$. The strongly connected component
(SCC) of $p \in Q$ is $\mathrm{scc}(p) := \{q \in Q \mid p = q \text{ or } (p \to q \text{ and } q \to p)\}$. The set
$\mathrm{SCCs}(\mathcal{A}) := \{\mathrm{scc}(q) \mid q \in Q\}$ is the set of all SCCs and partitions $Q$. An SCC
is accepting (rejecting) if all (no) runs that stay there forever are accepting.
An SCC is useless if no accepting run can continue from there. An automaton
is weak if the set of final states is a union of its SCCs. In this case, Büchi
and Co-Büchi acceptance are equivalent and we treat weak automata as Büchi
automata.
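
The definitions of $\mathrm{scc}(p)$ and of weakness can be checked directly, as in the following sketch. It follows the definition literally via reachability instead of using a linear-time SCC algorithm; the function names and the three-state example are assumptions for illustration.

```python
def reachable(edges, start):
    """States reachable from start via a path of length >= 1."""
    seen, stack = set(), [start]
    while stack:
        s = stack.pop()
        for (u, v) in edges:
            if u == s and v not in seen:
                seen.add(v)
                stack.append(v)
    return seen

def sccs(states, Delta):
    """Set of all SCCs, where scc(p) = {q | p = q or (p -> q and q -> p)}."""
    edges = {(p, q) for (p, _symbol, q) in Delta}
    reach = {p: reachable(edges, p) for p in states}
    return {
        frozenset(q for q in states
                  if q == p or (q in reach[p] and p in reach[q]))
        for p in states
    }

def is_weak(states, Delta, final):
    """Weak: every SCC is either contained in or disjoint from the final states."""
    return all(c <= final or not (c & final) for c in sccs(states, Delta))

# Hypothetical example: SCCs are {0, 1} and {2}.
states = {0, 1, 2}
Delta = {(0, "a", 1), (1, "a", 0), (1, "b", 2), (2, "a", 2)}
print(sccs(states, Delta), is_weak(states, Delta, {2}))
```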

A classical automaton is trim if it has no useless SCCs, whereas a probabilistic
automaton is trim if it has at most one useless SCC, which is a rejecting sink
that we canonically call $q_{rej}$. We assume w.l.o.g. that all considered automata
are trim, which also means that in an underlying automaton the sink $q_{rej}$ is
removed.

We call transitions of probabilistic automata that have probability 1
deterministic and otherwise branching. If there are transitions $p \xrightarrow{a} q$ and $p \xrightarrow{a} q'$ with
$q \neq q'$, we call this pattern a fork. Every branching transition clearly has at least
one fork. We call a $(p,q,q')$ fork intra-SCC if $p,q,q'$ are all in the same SCC,
otherwise it is an inter-SCC fork. A run of an automaton is deterministic if it
never goes through forks, and limit-deterministic if it goes only through finitely
many forks. We say that two deterministic runs merge when they reach the same
state simultaneously. For a finite run prefix $\rho$, we call all valid runs with this
prefix continuations of $\rho$.
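
To make the notion of a fork concrete, the following sketch enumerates all forks of a probabilistic automaton and classifies them as intra-SCC or inter-SCC. The dictionary encoding, the names `forks` and `is_intra_scc`, and the map `scc_of` (assumed to be supplied by the caller) are hypothetical.

```python
from fractions import Fraction

def forks(delta):
    """All forks (p, a, q, q2): distinct successors q, q2 of p on symbol a."""
    found = set()
    for (p, a), dist in delta.items():
        succs = sorted(q for q, prob in dist.items() if prob > 0)
        for i, q in enumerate(succs):
            for q2 in succs[i + 1:]:
                found.add((p, a, q, q2))
    return found

def is_intra_scc(fork, scc_of):
    """scc_of maps each state to an identifier of its SCC."""
    p, _a, q, q2 = fork
    return scc_of[p] == scc_of[q] == scc_of[q2]

half = Fraction(1, 2)
one = Fraction(1)
# Hypothetical example: p and q form one SCC, r is a sink.
delta = {
    ("p", "a"): {"p": half, "q": half},
    ("p", "b"): {"p": half, "r": half},
    ("q", "a"): {"p": one},
    ("q", "b"): {"q": one},
    ("r", "a"): {"r": one},
    ("r", "b"): {"r": one},
}
scc_of = {"p": 0, "q": 0, "r": 1}
for f in sorted(forks(delta)):
    print(f, "intra-SCC" if is_intra_scc(f, scc_of) else "inter-SCC")
```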
A classical automaton $\mathcal{A}$ accepts $w \in \Sigma^\omega$ if there exists an accepting run on
$w$, and the language $L(\mathcal{A})$ recognized by $\mathcal{A}$ is the set of all accepted words. If
$P$ is a set of states of an automaton, we write $L(P)$ for the language accepted
by this automaton with initial state set $P$. For sets consisting of one state $q$, we
write $L(q)$ instead of $L(\{q\})$.

For a probabilistic automaton $\mathcal{A}$ and an input word $w$ (finite or infinite),
the transition structure of $\mathcal{A}$ induces a probability space on the set of runs of
$\mathcal{A}$ on $w$ in the usual way. We do not provide the details here but rather refer
the reader not familiar with these concepts to [4]. In general, we write $\Pr(E)$ for
the probability of a measurable event $E$ in a probability space. For probabilistic
automata, we consider positive, almost-sure and threshold semantics, i.e., an
automaton accepts $w$ if the probability of the set of accepting runs on $w$ is $> 0$,
$= 1$ or $> \lambda$ (for some fixed $\lambda \in ]0,1[$), respectively. For an automaton $\mathcal{A}$ these
languages are denoted by $L^{>0}(\mathcal{A})$, $L^{=1}(\mathcal{A})$ and $L^{>\lambda}(\mathcal{A})$, respectively, whereas
$L(\mathcal{A}) := L(\mathcal{A}^{\triangledown})$ is the language of the underlying automaton. A probabilistic
automaton is 0/1 if all words are accepted with either probability 0 or 1 (in this
case the languages with the different probabilistic semantics coincide).

To denote the type of an automaton, we use abbreviations of the form XYA$^\gamma$,
where the type of transition structure is denoted by X $\in$ { D (det.), N (nondet.),
P (prob.) }, the acceptance condition is specified by Y $\in$ { F (finite word),
B (Büchi), C (Co-Büchi), W (Weak) }, and for probabilistic transitions the
semantics for acceptance is given by $\gamma \in \{>0, =1, >\lambda, 0/1\}$.

By $\mathbb{L}^\gamma$(XYA) we denote the whole class of languages accepted by the
corresponding type of automaton. If $\mathbb{L}$ is a set of languages, then $\overline{\mathbb{L}}$ denotes the
set of all complement languages (similarly, for a language $L$, we denote by $\overline{L}$ its
complement), and $\mathrm{BCl}(\mathbb{L})$ the set of all finite boolean combinations of languages
in $\mathbb{L}$. We use the notion of regular language for finite words and for infinite words
(the type of words is always clear from the context).


3 Ambiguity of PBA 


Ambiguity of automata refers to the number of different accepting runs on a
word or on all words. An automaton is finitely ambiguous (on $w$) if there are
at most $k$ different accepting runs (on $w$) for some fixed $k \in \mathbb{N}$, and in case of
at most one accepting run it is called unambiguous. If on each word there are
only finitely many accepting runs, but no constant upper bound over all words,
then it is polynomially ambiguous if the number of different run prefixes that
are possible for any word prefix of length $n$ can be bounded by a polynomial in
$n$, and otherwise exponentially ambiguous. Finally, if there exist words that
have infinitely many runs, but no word on which there are uncountably many
accepting runs, then it is countably ambiguous, and otherwise it is uncountably
ambiguous.
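
The distinction between polynomial and exponential growth of run prefixes can be observed on small examples. The following sketch counts the run prefixes of a nondeterministic automaton on a finite word prefix by dynamic programming over the current state; it only counts run prefixes and does not decide the ambiguity class. The function name and both example automata are assumptions for illustration.

```python
def count_run_prefixes(Delta, Q0, word):
    """Number of distinct run prefixes on the finite word, grouped by end state."""
    counts = {q: 1 for q in Q0}
    for symbol in word:
        nxt = {}
        for (p, a, q) in Delta:
            if a == symbol and p in counts:
                nxt[q] = nxt.get(q, 0) + counts[p]
        counts = nxt
    return sum(counts.values())

# Hypothetical examples over the one-letter alphabet {a}, initial state p.
linear = {("p", "a", "p"), ("p", "a", "q"), ("q", "a", "q")}       # n + 1 prefixes
fibonacci = {("p", "a", "p"), ("p", "a", "q"), ("q", "a", "p")}    # Fibonacci growth

for n in range(1, 6):
    print(n,
          count_run_prefixes(linear, {"p"}, "a" * n),
          count_run_prefixes(fibonacci, {"p"}, "a" * n))
```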

In [16] (see also [19]), a syntactic characterization of those classes is presented
for NBA by simple patterns of states and transitions. We define those patterns
here and refer to [16] for further details. An automaton $\mathcal{A}$ has an IDA pattern
if there exist two states $p \neq q$ and a word $v \in \Sigma^*$ such that $p \xrightarrow{v} p$, $p \xrightarrow{v} q$
and $q \xrightarrow{v} q$. If additionally $q \in F$, then this is also an IDA$_F$ pattern. Finally,
$\mathcal{A}$ has an EDA pattern if there exists a state $p$ and $v \in \Sigma^*$ such that there
are two different paths $p \xrightarrow{v} p$, and if additionally $p \in F$, this is also an EDA$_F$
pattern. If a PBA has no EDA pattern, we call it flat, reflecting the naming
of a similar concept in other kinds of transition systems (e.g. [15]). The names
IDA and EDA abbreviate “infinite/exponential degree of ambiguity”, which they
indicated in the original NFA setting, and we keep those names for consistency.
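
Both patterns can be detected by reachability in a synchronized product of the automaton with itself, as in the following sketch. This product-based check is a standard technique and an assumption here, not necessarily the procedure used in [16]: an EDA pattern at $p$ corresponds to a cycle through $(p,p)$ in the two-fold product that visits some pair $(q,q')$ with $q \neq q'$, and an IDA pattern corresponds to a path from $(p,p,q)$ to $(p,q,q)$ in the three-fold product. All names and examples are hypothetical.

```python
from itertools import product

def product_reach(Delta, start):
    """Tuples reachable from start (>= 1 step) when all components read the same symbol."""
    succ = {}
    for (p, a, q) in Delta:
        succ.setdefault((p, a), set()).add(q)
    symbols = {a for (_p, a, _q) in Delta}
    seen, stack = set(), [start]
    while stack:
        current = stack.pop()
        for a in symbols:
            options = [succ.get((s, a), set()) for s in current]
            for nxt in product(*options):
                if nxt not in seen:
                    seen.add(nxt)
                    stack.append(nxt)
    return seen

def has_eda(states, Delta):
    """Two different paths p -> p on the same word, for some state p."""
    for p in states:
        for (q, q2) in product_reach(Delta, (p, p)):
            if q != q2 and (p, p) in product_reach(Delta, (q, q2)):
                return True
    return False

def has_ida(states, Delta):
    """States p != q and a word v with p -> p, p -> q and q -> q on v."""
    return any(p != q and (p, q, q) in product_reach(Delta, (p, p, q))
               for p in states for q in states)

S = {"p", "q"}
ida_only = {("p", "a", "p"), ("p", "a", "q"), ("q", "a", "q")}
with_eda = {("p", "a", "p"), ("p", "a", "q"), ("q", "a", "p")}
print(has_ida(S, ida_only), has_eda(S, ida_only), has_eda(S, with_eda))
```

The variants IDA$_F$ and EDA$_F$ would additionally require $q \in F$ and $p \in F$, respectively.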


By $k$-NBA, $n^k$-NBA, $2^n$-NBA, $\aleph_0$-NBA we denote the subsets of at most
finitely, polynomially, exponentially and countably ambiguous NBA (and
similarly for other types of automata). When speaking about ambiguity of some
PBA $\mathcal{A}$, we mean the ambiguity of the trimmed underlying NBA $\mathcal{A}^{\triangledown}$.


In [8], hierarchical PBA (HPBA) were identified as a syntactic restriction 
on PBA which ensures regularity under positive and almost-sure semantics. A 
PBA with a unique initial state is hierarchical, if it admits a ranking on the 
states such that at most one successor on a symbol has the same rank, and no 
successor has a smaller rank. A HPBA has $k$ levels if it can be ranked with only
k different values. Simple PBA (SPBA) were introduced in [9] and are restricted 
HPBA with two levels such that all accepting states are on level 0. 
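
Whether a given ranking witnesses the hierarchical property can be verified transition by transition. The following sketch checks the two ranking conditions for a candidate ranking; it does not search for a ranking and does not check the requirement of a unique initial state. The encoding and names are assumptions for illustration.

```python
from fractions import Fraction

def is_valid_ranking(delta, rank):
    """At most one successor per (state, symbol) keeps the rank; none decreases it."""
    for (p, _a), dist in delta.items():
        succs = [q for q, prob in dist.items() if prob > 0]
        if any(rank[q] < rank[p] for q in succs):
            return False
        if sum(1 for q in succs if rank[q] == rank[p]) > 1:
            return False
    return True

def num_levels(rank):
    """Number of different rank values used."""
    return len(set(rank.values()))

half = Fraction(1, 2)
# Hypothetical two-state example.
delta = {
    ("p", "a"): {"p": half, "q": half},
    ("q", "a"): {"q": Fraction(1)},
}
print(is_valid_ranking(delta, {"p": 0, "q": 1}), num_levels({"p": 0, "q": 1}))
```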


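[Figure 1: diagram of the ambiguity hierarchy. Labels recoverable from the figure: $\mathbb{L}^{>0}$($\aleph_0$-PBA) regular; $\mathbb{L}^{=1}$(flat PBA $\cup$ $2^n$-PBA) regular; $\neg$EDA$_F$: countably ambiguous; $\mathbb{L}^{>\lambda}$($k$-PBA) regular; unambiguous; SPBA.]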


Fig. 1: Illustration of the automata classes with restricted ambiguity as presented 
for NBA in [16], which are characterized by the absence of the state patterns 
IDA, IDA$_F$, EDA, and EDA$_F$ and their relation to the restricted classes called
“Hierarchical PBA” (HPBA) [8] and “Simple PBA” (SPBA) [9]. We identify classes 
in this hierarchy which can be seen as extensions “in spirit” of respectively SPBA 
and HPBA, subsuming them while also preserving their good properties, as e.g. 
definition by syntactic means, regularity under different semantics and several 
complexity results. 
First, we show how HPBA relate to the ambiguity hierarchy, which can easily
be derived by inspection of the definitions. A visual illustration is given in
Figure 1.


Proposition 1 (Relation of HPBA and the ambiguity hierarchy). 


1. HPBA $\subset$ flat PBA $\subset$ $\aleph_0$-PBA.
2. $k$-PBA $\not\subseteq$ HPBA and HPBA $\not\subseteq$ $k$-PBA.
3. SPBA $\subset$ unambiguous PBA $\subset$ $k$-PBA.


Starting from these observations, this work was motivated by the question 
whether the ambiguity restrictions, which were only implicit in HPBA and 
SPBA, can be used explicitly to get larger classes with good properties. In the 
following we will positively answer this question. 


3.1 From classical to probabilistic automata 


First, we observe that probabilistic automata can recognize regular languages 
even under severe ambiguity restrictions. 


Proposition 2. Let $\mathcal{A}$ be a DBA. Then there exists an unambiguous PBA $\mathcal{B}$
such that $L^{>0}(\mathcal{B}) = L^{=1}(\mathcal{B}) = L(\mathcal{A})$.


Proof. As $\mathcal{A}$ is a (w.l.o.g. complete) DBA, there exists exactly one run on each
word and all transitions, when seen as a PBA, must have probability 1. Clearly this
unique natural 0/1 PBA obtained from $\mathcal{A}$ accepts the same language under both
positive and almost-sure semantics and it is trivially unambiguous.


Limit-deterministic NBA (LDBA) are NBA which are deterministic in all
non-rejecting SCCs. The natural mapping of LDBA into PBA [4, Lemma 4.2]
already trivially yields countably ambiguous automata (because the
deterministic part of the LDBA cannot contain an EDA$_F$ pattern, which implies
uncountable ambiguity [16]). The following result shows that already unambiguous PBA
under positive semantics suffice for all regular languages.


Theorem 1. Let $L \subseteq \Sigma^\omega$ be a regular language.
Then there exists an unambiguous PBA $\mathcal{B}$ such that $L^{>0}(\mathcal{B}) = L$.


Proof (sketch). Let $\mathcal{A} = (Q, \Sigma, \delta, q_0, c)$ be a deterministic parity automaton
accepting $L$, i.e., a finite automaton with priority function $c : Q \to \{1,\ldots,m\}$
such that $w \in L(\mathcal{A})$ iff the smallest priority assigned to a state on the unique
run of $\mathcal{A}$ on $w$ which is seen infinitely often is even.

We construct an unambiguous LDBA for $L$, which then easily yields a PBA$^{>0}$
by assigning arbitrary probabilities ([4, Lemma 4.2]) without influencing the
ambiguity. If the parity automaton $\mathcal{A}$ has $m$ priorities, the LDBA $\mathcal{B}$ can be
obtained by taking $m+1$ copies, where $m$ of them are responsible for one priority
each, and one is modified to guess which priority $i$ on the input word is the most
important one appearing infinitely often along the run of $\mathcal{A}$, and correspondingly
switch into the correct copy. This switching is done unambiguously for the first
position after which no priority more important than $i$ appears.
3.2 From probabilistic to classical automata


First we establish a result for flat PBA, i.e. PBA that have no EDA pattern. 
In automata without EDA pattern there are no states which are part of two 
different cycles labeled by the same finite word. Even though we defined flat 
PBA by using an ambiguity pattern, the set of flat PBA does not correspond to 
an ambiguity class, but it is useful for our purposes due to the following property: 


Lemma 1. If $\mathcal{A}$ is a flat PBA and $w \in \Sigma^\omega$, then the probability of a run of $\mathcal{A}$
on $w$ to be limit-deterministic is 1.


Proof. Let $\mathrm{Runs}(\mathcal{A}, w)$ denote the set of all runs of $\mathcal{A}$ on $w$ and $\mathrm{nldRuns}(\mathcal{A}, w)$
denote the subset containing all such runs that are not limit-deterministic. As
$\mathcal{A}$ is flat, it has no EDA and thus also no EDA$_F$ pattern, hence $\mathcal{A}$ is at most
countably ambiguous (by [16]). Moreover, there are not only at most countably
many accepting runs on any word, but also at most countably many rejecting runs (which
can be seen by a simple generalization of [16, Lemma 4]). But as all runs are
disjoint events, each run $\rho$ that uses infinitely many forks has probability 0, and
the total number of runs is countable, we can see that

$$\Pr(\mathrm{Runs}(\mathcal{A}, w) \setminus \mathrm{nldRuns}(\mathcal{A}, w)) = \sum_{\rho \in \mathrm{Runs}(\mathcal{A},w)} \Pr(\rho) - \sum_{\rho \in \mathrm{nldRuns}(\mathcal{A},w)} \Pr(\rho) = 1 - 0 = 1.$$


The following lemma characterizes acceptance of PBA under extremal se- 
mantics with restricted ambiguity and is crucial for the constructions in the 
following sections: 


Lemma 2 (Characterizations for extremal semantics).
Let $\mathcal{A}$ be a PBA.

1. If $\mathcal{A}$ is at most countably ambiguous, then
   $w \in L^{>0}(\mathcal{A}) \Leftrightarrow$ there exists an accepting run on $w$ that is limit-deterministic.
2. If there are finitely many accepting runs of $\mathcal{A}$ on $w$, then
   $w \in L^{=1}(\mathcal{A}) \Leftrightarrow$ all runs on $w$ are accepting and limit-deterministic.
3. If $\mathcal{A}$ is flat, then
   $w \in L^{=1}(\mathcal{A}) \Leftrightarrow$ there is no limit-deterministic rejecting run on $w$.


Proof. (1.) ($\Rightarrow$): For contradiction, assume that every accepting run on $w$ goes
through forks infinitely often. But then the probability of every individual
accepting run on $w$ is 0. Each run is a measurable event (it is a countable
intersection of the events given by its finite prefixes) and clearly disjoint from
other runs, as two different runs must eventually differ after a finite prefix. But
as the number of accepting runs is countable by assumption, by $\sigma$-additivity it
follows that the probability of all accepting runs is also 0, contradicting the
fact that $w \in L^{>0}(\mathcal{A})$.

For the other direction, pick a limit-deterministic accepting run $\rho$ of $\mathcal{A}$ on $w$
and let $uv = w$ and $q \in Q$ such that the state of $\rho$ after reading $u$ is $q$ and there
are no forks visited on $v$. Clearly, the probability to be in $q$ after $u$ in a run of $\mathcal{A}$
is positive (because $u$ is finite), and the probability that $\mathcal{A}$ continues like $\rho$ from
$q$ on $v$ is 1. Hence, the probability of $\rho$ is positive.
(2.) : The ($\Leftarrow$) direction is obvious. We now proceed to show ($\Rightarrow$). Take
some time $t$ after which all accepting runs on $w$ have separated. Assume that some
accepting run $\rho$ is not limit-deterministic. But then $\rho$ goes through infinitely
many forks after $t$ which with positive probability lead to a successor from which
the probability to accept is 0, and the probability of following $\rho$ is also 0. As
the probability to follow $\rho$ until time $t$ is positive, but after that the probability
to accept is 0, this implies that there is a positive probability that $\mathcal{A}$ rejects
$w$. Therefore, all accepting runs on $w$ must be limit-deterministic. Now assume
that some run $\rho$ on $w$ is rejecting. Following this run until the time at which $\rho$ is
separated from all accepting runs has positive probability and all continuations
must be also rejecting, so $\mathcal{A}$ must reject $w$.

(3.) : Clearly ($\Rightarrow$) holds, because a limit-deterministic rejecting run has
positive probability, i.e., if such a run exists on $w$, then $\mathcal{A}$ cannot accept almost
surely. For ($\Leftarrow$), observe that because $\mathcal{A}$ is flat, we know by Lemma 1 that
with probability 1 runs are limit-deterministic. Hence, if there exists no
limit-deterministic rejecting run on $w$ (which would have positive probability), then
with probability 1 runs are limit-deterministic and accepting.


Using these characterizations, we can provide simple constructions from prob- 
abilistic to classical automata. 


Theorem 2. Let $\mathcal{A}$ be a PBA that is at most countably ambiguous.
Then $L^{>0}(\mathcal{A})$ is a regular language.


Proof (sketch). The language is accepted by an NBA built from two copies of the
PBA, where in the first copy no state is accepting and the second copy has no
forks; its purpose is to guess a limit-deterministic accepting run.
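
One possible reading of this proof sketch is given below as a minimal Python sketch; the details of the actual construction are not given in the text, so everything here is an assumption. Copy 1 keeps all positive-probability transitions and has no accepting states, the automaton may move to copy 2 on any transition, and copy 2 keeps only transitions of probability 1, so that an accepting run of the NBA corresponds to a limit-deterministic accepting run of the PBA (cf. Lemma 2).

```python
from fractions import Fraction

def two_copy_nba(delta, mu0, final):
    """Return (Delta, Q0, F) of an NBA over states (q, 1) and (q, 2)."""
    Delta = set()
    for (p, a), dist in delta.items():
        for q, prob in dist.items():
            if prob > 0:
                Delta.add(((p, 1), a, (q, 1)))  # stay in copy 1
                Delta.add(((p, 1), a, (q, 2)))  # guess: no forks from now on
            if prob == 1:
                Delta.add(((p, 2), a, (q, 2)))  # copy 2: deterministic transitions only
    Q0 = {(q, 1) for q, prob in mu0.items() if prob > 0}
    F = {(q, 2) for q in final}                 # accepting states only in copy 2
    return Delta, Q0, F

half = Fraction(1, 2)
# Hypothetical example PBA with accepting sink q.
delta = {
    ("p", "a"): {"p": half, "q": half},
    ("q", "a"): {"q": Fraction(1)},
}
Delta, Q0, F = two_copy_nba(delta, {"p": Fraction(1)}, {"q"})
print(len(Delta), Q0, F)
```

The construction only inspects whether a probability is positive or equal to 1 and never uses the exact values.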


Corollary 1. If $L^{>0}(\mathcal{A})$ is not regular, then $\mathcal{A}$ contains an EDA$_F$ pattern.


Theorem 3. Let $\mathcal{A}$ be a PBA that is at most exponentially ambiguous or flat.
Then $L^{=1}(\mathcal{A})$ is regular and recognizable by a DBA.


Proof (sketch). Both cases (exponentially ambiguous or flat) are shown using a
deterministic breakpoint construction resulting in a DBA. In one case it checks
whether all runs are accepting, in the other it checks that there are no
limit-deterministic rejecting runs.


Corollary 2. If $L^{=1}(\mathcal{A})$ is not regular,
then $\mathcal{A}$ contains both an EDA and an IDA$_F$ pattern.


The corollaries above follow directly from the theorems and the syntactic 
characterization of ambiguity classes [16]. The following proposition states that 
these characterizations of regularity in terms of the ambiguity patterns are tight. 
Fig. 2: (a) Some PWA which accepts the non-regular language $\{ w = (a+b)^*\$^\omega \mid
\#_a(w) > \#_b(w) \}$ with a threshold of $\tfrac{1}{2}$, where $\#_x(w)$ denotes the number of
occurrences of $x \in \Sigma$ in $w \in \Sigma^\omega$. (b) A family of PBA $\mathcal{P}_\lambda$ from [4] such that
$L^{>0}(\mathcal{P}_\lambda)$ is not regular for any $\lambda \in \mathbb{R}$. (c) A family of PWA $\mathcal{P}_\lambda$ (closely related
to [4, Fig. 6]) such that $L^{=1}(\mathcal{P}_\lambda)$ is not regular for any $\lambda \in \mathbb{R}$.


Proposition 3. There exist PBA... 


1. ...with EDA$_F$ pattern (i.e. uncountably ambiguous) that accept
non-regular languages under positive semantics.

2. ...with no EDA$_F$ pattern (i.e. countably ambiguous) that accept
non-regular languages under almost-sure semantics.


Proof. (1.) Note that this statement just means that there are PBA accepting
non-regular languages, which is well known. For example, the automata family
from [4, Fig. 3], depicted in Figure 2(b), accepts non-regular languages under
positive semantics and clearly contains an EDA$_F$ pattern, e.g. there are two
different paths from $p_0$ to $p_0$ on the word $aab$.

(2.) The automata family depicted in Figure 2(c) is a simple modification
of the PBA family depicted in [4, Fig. 6] and recognizes the same non-regular
languages under almost-sure semantics. It does not contain an EDA$_F$ pattern,
because the accepting state is a sink, but it does contain an IDA$_F$ and an EDA
pattern (both e.g. on $aab$), so it is countably ambiguous and not flat.


This completes our classification of regular subclasses of PBA under extremal 
semantics that are defined by ambiguity patterns, showing that going beyond the 
restricted classes presented above (by allowing more patterns) in general leads 
to a loss of regularity. 

Notice that the presented constructions do not track exact probabilities, just 
whether transitions have a probability > 0 or = 1. This is a noteworthy obser- 
vation, as in general, the probabilities do matter for PBA, as shown in [4, Thm. 
4.7, Thm. 4.11]. 


Proposition 4. Let $\mathcal{A}$ be a PBA. The exact probabilities in $\mathcal{A}$ do not
influence $L^{>0}(\mathcal{A})$ if $\mathcal{A}$ is at most countably ambiguous, and $L^{=1}(\mathcal{A})$ if $\mathcal{A}$ is at most
exponentially ambiguous or flat.
3.3 Threshold Semantics


In this section we consider PBA under threshold semantics and we will see that 
in this setting, we lose regularity much earlier than in the case of extremal 
semantics, but there is still the large and natural subclass of finitely ambiguous 
PBA that retains regularity. Before we can show this, we need to derive a suitable 
characterization of such languages. 

We derive it from the following simple observation, which was also used 
more implicitly in the proof that Simple HPBA with threshold semantics are 
equivalent to DBA in [9]. 


Lemma 3. Let $\mathcal{A}$ be a PBA. Then for every threshold $\lambda \in ]0,1]$, there exists a
finite set of probability values $V_{\ge\lambda} \subseteq [\lambda,1]$ such that for every finite run prefix
with probability $v$ in $\mathcal{A}$ we have $v \ge \lambda \Rightarrow v \in V_{\ge\lambda}$.


Proof. Observe that given a finite set of real numbers $R \subseteq ]0,1]$, the set $R_{\ge\lambda} :=
\{ r \mid r = \prod_i r_i \ge \lambda,\ r_i \in R \}$ must be finite: in any sequence $r_1 r_2 \ldots$ of $r_i \in R$
whose product remains $\ge \lambda$, at most $m = \lceil \log_{r_{\max}} \lambda \rceil$ values can be $< 1$, where
$r_{\max} := \max (R \setminus \{1\})$. In our case, let $R$ be the set of distinct probabilities
assigned to edges (including the initial edges) in $\mathcal{A}$. As every finite run prefix
by definition has the probability given by the product of the edge probabilities,
this implies the statement.
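
The finiteness argument can be made concrete by enumerating the set $R_{\ge\lambda}$ with a simple fixed-point iteration, as in the following sketch. The function name and the example values are assumptions for illustration; termination relies on the argument of the proof above.

```python
from fractions import Fraction

def products_at_least(R, lam):
    """All distinct values >= lam that are products of elements of R, R within ]0, 1]."""
    values = {r for r in R if r >= lam}
    frontier = set(values)
    while frontier:
        # Extend every newly found product by one more factor from R.
        frontier = {v * r for v in frontier for r in R if v * r >= lam} - values
        values |= frontier
    return values

# Hypothetical edge probabilities and threshold.
R = {Fraction(1), Fraction(1, 2), Fraction(1, 3)}
lam = Fraction(1, 5)
print(sorted(products_at_least(R, lam)))
```

In this example only the four values 1, 1/2, 1/3 and 1/4 are at least 1/5; every further multiplication by a factor below 1 drops below the threshold.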


If there is just one accepting run (i.e., the automaton is unambiguous), one 
can easily construct a nondeterministic automaton that guesses an accepting run 
and tracks it along with its probability value, of which there are only finitely 
many above the threshold. In the case that there are multiple accepting runs, for 
acceptance only the sum of their probabilities matters. As individual runs can 
in principle have arbitrarily small probability values, it is not obvious that the 
same approach (tracking a set of runs) can work. Determining a suitable cut-off 
point is not as simple, because it is not apparent when a single run becomes 
so improbable that it does not matter among the others. However, we will now 
show that such a cut-off point must exist: 


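Lemma 4. Let $\mathcal{A}$ be a PBA, $\lambda \in ]0,1]$ a threshold and $k \in \mathbb{N}$. There exists
$\varepsilon_k \in ]0,\lambda]$ such that for all sets $R^t = \{\rho_i\}_{i=1}^{j}$ of $j \le k$ different run
prefixes in $\mathcal{A}$ of the same length $t \in \mathbb{N}$, $\Pr(R^t) = \sum_{i=1}^{j} \Pr(\rho_i) < \lambda$ implies that
$\Pr(R^t) < \lambda - \varepsilon_k$.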


Proof. We prove this by induction on the number of runs $k$. For $k = 1$, i.e. a
single run prefix, let $V_{\ge\lambda}$ be the finite (by Lemma 3) set of different probability
values $\ge \lambda$ and let $E$ be the set of distinct probabilities in the automaton $\mathcal{A}$.
Then clearly $v_{\max,<\lambda} := \max\{a \cdot b \mid a \cdot b < \lambda,\ a \in V_{\ge\lambda},\ b \in E\}$ is the largest
probability value $< \lambda$ that can correspond to a finite run prefix in $\mathcal{A}$. Hence, we
can just pick an $\varepsilon_1 < \lambda - v_{\max,<\lambda}$ and immediately get that for any run prefix
with probability $v < \lambda$, we have that $v \le v_{\max,<\lambda} < \lambda - \varepsilon_1$.

Now assume the statement holds for all sets with at most $k$ run prefixes.
Let $R^t$ be a set of $k+1$ different run prefixes of the same length such that
$\Pr(R^t) < \lambda$ and let $\varepsilon := \varepsilon_k$. Then we know that for every subset $S$ of at most $k$
runs of $R^t$ we have $\Pr(S) < \lambda - \varepsilon$. Also, every single run prefix can by Lemma 3
have one of only finitely many probability values in $V_{\ge\varepsilon}$ that are $\ge \varepsilon$ and there
exists a value $v_{\max,<\varepsilon}$ denoting the largest possible probability value $< \varepsilon$ that a
single run prefix can have.

If there exists a run prefix $\rho \in R^t$ with probability value $v < \varepsilon$, then we
know that $\Pr(R^t) = \Pr(R^t \setminus \{\rho\}) + v < (\lambda - \varepsilon) + v_{\max,<\varepsilon} < \lambda$. If every run in
$R^t$ has a probability value $\ge \varepsilon$, then every run prefix in $R^t$ has as probability
one of the values in $V_{\ge\varepsilon}$. Consider all sums of $k+1$ values from $V_{\ge\varepsilon}$, which are
finitely many, and pick the largest sum $s$ which is $< \lambda$. Choose $\varepsilon_{k+1}$ such that
$\varepsilon_{k+1} < \min(\varepsilon - v_{\max,<\varepsilon}, \lambda - s)$ to account for both cases.
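
The base case $k = 1$ of this proof can be traced numerically. The following sketch computes $v_{\max,<\lambda}$ and one admissible choice of $\varepsilon_1$ for a given set $E$ of edge probabilities. As an assumption of this sketch, the value 1 is added to $V_{\ge\lambda}$ to represent the empty prefix, and $\varepsilon_1$ is chosen as half of the gap $\lambda - v_{\max,<\lambda}$; any smaller positive value would work equally well. All names and example values are hypothetical.

```python
from fractions import Fraction

def products_at_least(R, lam):
    """All distinct values >= lam that are products of elements of R, R within ]0, 1]."""
    values = {r for r in R if r >= lam}
    frontier = set(values)
    while frontier:
        frontier = {v * r for v in frontier for r in R if v * r >= lam} - values
        values |= frontier
    return values

def base_case_gap(E, lam):
    """Return (v_max, eps_1) with v_max the largest prefix probability below lam."""
    V = products_at_least(set(E) | {Fraction(1)}, lam)   # assumption: 1 = empty prefix
    below = [a * b for a in V for b in E if a * b < lam]
    v_max = max(below, default=Fraction(0))
    eps_1 = (lam - v_max) / 2                            # one choice with eps_1 < lam - v_max
    return v_max, eps_1

E = {Fraction(1), Fraction(1, 2), Fraction(1, 3)}
lam = Fraction(1, 5)
v_max, eps_1 = base_case_gap(E, lam)
print(v_max, eps_1)
```

For these values, every run prefix with probability below 1/5 has probability at most 1/6, which is below $\lambda - \varepsilon_1 = 11/60$.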


From this we can derive the following characterization of languages accepted 
by finitely ambiguous PBA under threshold semantics: 


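Lemma 5. Let $\mathcal{A}$ be a $k$-ambiguous PBA and $\lambda \in ]0,1]$ a threshold. There exists
an $\varepsilon \in ]0,\lambda]$ such that for all $w \in \Sigma^\omega$: $w \in L^{>\lambda}(\mathcal{A})$ iff there exists a set $R$ of
limit-deterministic accepting runs of $\mathcal{A}$ on $w$ with $\Pr(R) > \lambda$, $\Pr(S) \le \lambda$ for all
$S \subsetneq R$ and at most one run $\rho \in R$ with $\Pr(\rho) < \varepsilon$.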


Proof. Clearly ($\Leftarrow$) holds, as then $w$ is accepted with probability $\geq \Pr(R) > \lambda$.
We now show ($\Rightarrow$). In a finitely ambiguous PBA there are only finitely many
different accepting runs on each word. Furthermore, as after finite time all accepting
runs have separated and each accepting run that visits forks infinitely
often has probability 0, accepting runs that visit forks infinitely often do not contribute
positively to the acceptance probability and thus can be ignored. Hence,
if $w \in L^{>\lambda}(\mathcal{A})$, there is a number of accepting runs that eventually all become
deterministic and each such run has a positive probability, which must in total
be $> \lambda$.

Let $R$ be a set of different limit-deterministic accepting runs of $\mathcal{A}$ on $w$ such
that $\Pr(R) > \lambda$ and $\Pr(S) \leq \lambda$ for all $S \subsetneq R$. As there are only finitely many
accepting runs, such a set $R$ must exist. Furthermore, notice that each limit-deterministic
run has a finite prefix which has the same probability as the whole
run, so there exists a time $t$ such that the probability of the set of all different
prefixes of runs in $R$ of length $t$ is exactly $\Pr(R)$, so that Lemma 4 applies.

Now pick an $\varepsilon := \varepsilon_k$ given by Lemma 4. We claim that at most one run
$\rho \in R$ can have a probability less than $\varepsilon$. If there is no such run in $R$, we are
done. Otherwise let $\rho$ be a run with $\Pr(\rho) =: p < \varepsilon$ and notice that by choice
of $R$, we have that $\Pr(R \setminus \{\rho\}) =: s \leq \lambda$. It cannot be the case that $s < \lambda$, as
then by Lemma 4 we have $s < \lambda - \varepsilon$, which implies that $\Pr(R) = s + p < \lambda$,
which is a contradiction. Hence, now assume that $s = \lambda$. But then, if there is
any $\rho' \neq \rho \in R$ such that $\Pr(\rho') =: p' < \varepsilon$, by the same argument we get the
contradiction that $s - p' < \lambda - \varepsilon$ and hence $s < \lambda$. Therefore, no other run in $R$
can have a probability $< \varepsilon$.


Now we can perform the intended automaton construction to show: 


Theorem 4. $L^{>\lambda}(\mathcal{A})$ is regular for each $k$-ambiguous PBA $\mathcal{A}$ and $\lambda \in ]0,1[$.

Proof (sketch). We use the characterization of Lemma 5 to construct a generalized
Büchi automaton accepting $L^{>\lambda}(\mathcal{A})$. Intuitively, the new automaton just
guesses at most $k$ different runs of $\mathcal{A}$ and verifies that the guessed runs are limit-deterministic
and accepting. The automaton additionally tracks the probability
of the runs over time, to determine whether the individual runs and their sum
have enough “weight”. The automaton rejects when the total probability of the
guessed runs is $\leq \lambda$, one of the runs goes into the rejecting sink $q_{rej}$ or a run
does not see accepting states infinitely often.

By Lemma 5 we only need to consider sets of runs with at most one run that
has a probability $< \varepsilon$, where $\varepsilon := \varepsilon_k$ is given by Lemma 4. For this single run
we also do not need to track the exact probability value, as its only purpose is
to witness that the acceptance probability is strictly greater than $\lambda$, whereas all
other runs must have one of the finitely many different probabilities which are
$\geq \varepsilon$ and must sum to $\lambda$.


This generalizes the corresponding result for PFA [12, Theorem 3]. The proof 
in [12] uses similar concepts, though a rather different presentation. In the setting 
of infinite words we additionally have to deal with a single run that has arbitrarily 
low probability, and we have to ensure that this probability remains positive. 

After seeing that finitely ambiguous PBA retain regularity, we show that this 
is the best we can do under threshold semantics: 


Corollary 3. There are polynomially ambiguous PBA $\mathcal{A}$, that is, with an IDA
pattern and no EDA or IDA$_F$ patterns, such that $L^{>\lambda}(\mathcal{A})$ is not regular even for
rational thresholds $\lambda \in ]0,1[$.


Proof. Follows from the fact that the PWA $\mathcal{A}$ from Figure 2(a), which recognizes
a non-regular language (and is used to show Proposition 6), has just an IDA
pattern in the underlying NBA, but no EDA or IDA$_F$ patterns.


This completes our characterization of languages which are recognized by 
PBA that are restricted by forbidden ambiguity patterns, so that we can state 
our main result of this section (see Figure 1 for a visualization): 


Theorem 5. The following results hold about PBA with restricted ambiguity: 


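— $L^{>0}(k\text{-PBA}) = L^{>0}(\aleph_0\text{-PBA}) = L(\text{NBA})$
— $L^{=1}(k\text{-PBA}) = L^{=1}(2^n\text{-PBA}) = L^{=1}(\text{flat PBA}) = L(\text{DBA}) \subsetneq L^{=1}(\aleph_0\text{-PBA})$
— $L^{>\lambda}(k\text{-PBA}) = L(\text{NBA}) \subsetneq L^{>\lambda}(n^k\text{-PBA})$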


Proof. The statements follow from the following inclusion chains:

$L(\text{NBA}) \overset{(1.)}{\subseteq} L^{>0}(k\text{-PBA}) \overset{\text{def.}}{\subseteq} L^{>0}(\aleph_0\text{-PBA}) \overset{(2.)}{\subseteq} L(\text{NBA})$

$L(\text{DBA}) \overset{(3.)}{\subseteq} L^{=1}(k\text{-PBA}) \overset{\text{def.}}{\subseteq} L^{=1}(2^n\text{-PBA} \cup \text{flat PBA}) \overset{(4.)}{\subseteq} L(\text{DBA}) \overset{(5.)}{\subsetneq} L^{=1}(\aleph_0\text{-PBA})$

$L(\text{NBA}) \overset{(1.)}{\subseteq} L^{>0}(k\text{-PBA}) \overset{(6.)}{\subseteq} L^{>\lambda}(k\text{-PBA}) \overset{(7.)}{\subseteq} L(\text{NBA}) \overset{(8.)}{\subsetneq} L^{>\lambda}(n^k\text{-PBA})$

Here the marked relationships hold due to: (1.) Theorem 1, (2.) Theorem 2,
(3.) Proposition 2, (4.) Theorem 3, (5.) Proposition 3, (6.) a simple transformation
that adds a new accepting sink $q_{acc}$ and modifies the initial distribution $\mu_0$ [4,
Lemma 4.16], (7.) Theorem 4, (8.) Corollary 3, and (def.) by definition of the
ambiguity-restricted automata classes.


4 Complexity results 


In this section, we state some upper and lower bounds on the complexity for 
deciding emptiness and universality for PBA with restricted ambiguity, derived 
from the characterizations and constructions presented above. 


Theorem 6. 


1. The emptiness problem for $\aleph_0$-PBA$^{>0}$ is in NL.
2. The universality problem for $\aleph_0$-PBA$^{>0}$ is in PSPACE.
3. The universality problem for at most exponentially ambiguous or flat PBA$^{=1}$ is in NL.


Proof. (1. + 2.): By Theorem 2 the languages of $\aleph_0$-PBA$^{>0}$ are regular. The
construction of an NBA just uses two copies of the given PBA. For emptiness,
it thus suffices to guess an accepted ultimately periodic word and verify that it
is accepted by the NBA, which can be done in NL. Since universality for NBA
is in PSPACE [21], we also obtain (2.).

(3.): If the automaton is at most exponentially ambiguous, there are only
finitely many accepting runs on each word and as we know by Lemma 2 that
$w \in L^{=1}(\mathcal{A})$ iff all runs are accepting, it suffices to guess a rejecting run in
$\mathcal{A}$, which implies that the ultimately periodic word $w$ labelling that run cannot
be in $L^{=1}(\mathcal{A})$. If the automaton is flat, then we know that for each rejected
word there must exist a limit-deterministic rejecting run in the underlying NBA,
which we also can guess.
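
The emptiness argument rests on the standard lasso condition for Büchi automata: the language is non-empty iff some accepting state is reachable from an initial state and lies on a cycle. The following is a minimal deterministic sketch of that condition, not the nondeterministic logspace procedure itself; the graph encoding (a dictionary from states to successor lists, with letters dropped) and all names are assumptions made for illustration.

```python
from collections import deque


def reachable(graph, sources):
    """Return all states reachable from `sources` via zero or more edges."""
    seen = set(sources)
    queue = deque(seen)
    while queue:
        state = queue.popleft()
        for successor in graph.get(state, ()):
            if successor not in seen:
                seen.add(successor)
                queue.append(successor)
    return seen


def nba_nonempty(graph, initial, accepting):
    """True iff some accepting state is reachable and lies on a cycle."""
    for state in reachable(graph, initial) & set(accepting):
        # The state is on a cycle iff it is reachable from its own successors.
        if state in reachable(graph, list(graph.get(state, ()))):
            return True
    return False


# Hypothetical three-state automaton: q1 has a self-loop, q2 is a dead end.
example_graph = {"q0": ["q1"], "q1": ["q1", "q2"], "q2": []}
```

A lasso found this way corresponds to an ultimately periodic word: the path to the accepting state gives the finite prefix, and the cycle gives the repeated part.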


Type regular? Emptiness Universality 
>O=1>A} >0 =1 >0 =1 
k-PBA 
n*-PBA e NL € PSPACE | € PSPACE E NL 
2”-PBA y a 
ray | NL c. € PSPACE ce PSPACE c. -DSpace 


Table 1: Summary of main results from Theorems 5 and 6 concerning PBA with
ambiguity restrictions. The completeness results follow from the hardness results
for HPBA (which are subsumed by flat PBA) from [8, Section 5]; the PSPACE
inclusion of universality for almost-sure $\aleph_0$-PBA follows from [8, Theorem 4.4].


Observe that $\aleph_0$-PBA$^{>0}$ subsume HPBA$^{>0}$ and the union of flat PBA$^{=1}$ and
exponentially ambiguous PBA$^{=1}$ subsumes HPBA$^{=1}$, while preserving the same complexity
of the emptiness and universality problems. A summary of the main results
from Theorem 5 and Theorem 6 is presented in Table 1.

We conclude with an observation relevant to the question about feasibility
of PBA with restricted ambiguity for the purpose of application in e.g. model checking
or synthesis.


Proposition 5 (Relationship to classical formalisms). 


— There is a doubly-exponential lower bound for translation from LTL formula 
to countably ambiguous PBA with positive semantics. 

— There is an exponential lower bound for conversion from NBA to countably 
ambiguous PBA with positive semantics. 


Proof. It is known [20, Theorem 2] that there is a doubly-exponential lower 
bound from LTL to LDBA. It is also known that LTL to NBA has an exponential 
lower bound (e.g. [5, Theorem 5.42]), which implies an exponential lower bound 
from NBA to LDBA. 

By Theorem 2 there is a polynomial transformation from countably ambigu- 
ous PBA with positive semantics into LDBA, which together with the aforemen- 
tioned bounds implies the claimed lower bounds. 


5 Weakness in Probabilistic Büchi Automata


In this section we investigate the class of probabilistic weak automata (PWA), 
establishing the relation between different classes defined by PWA as shown in 
Figure 3 (see also the description of our contribution in the introduction). 

As a first remark, notice that PWA can be “complemented” by inverting
accepting and rejecting states and switching between dual semantics, e.g., for a
PWA $\mathcal{A}$ we have $\overline{L^{>0}(\mathcal{A})} = L^{=1}(\overline{\mathcal{A}})$, where $\overline{\mathcal{A}}$ is just $\mathcal{A}$ with inverted accepting
state set $F' = Q \setminus F$.

Since the overarching theme of this paper is trying to find regular subclasses 
of PBA, we will next establish the following result, showing that there is no hope 
to find a complete syntactical characterization of regularity in PBA: 


Theorem 7. The regularity of PWA (and therefore of PBA) under positive, 
almost-sure and threshold semantics is an undecidable problem. 


Fig. 3: Illustration of relationships between the class of languages accepted
by weak probabilistic automata under various semantics with other already
known classes. The overlapping patterns indicate intersection of classes, where
dots mark $L^{>0}(\text{PBA})$, and different diagonal lines respectively $L^{=1}(\text{PBA})$ and
$\overline{L^{=1}(\text{PBA})}$. The dashed line indicates intersections with different subclasses of
regular languages. The class $L^{>\lambda}(\text{PBA})$ contains all the other depicted classes,
$L^{>\lambda}(\text{PWA})$ contains the area inside the thick line. The depicted fact that
$L^{>0}(\text{PWA}) = L^{>\lambda}(\text{PWA}) \cap L^{>0}(\text{PBA})$ is a conjecture, one direction is shown
in Theorem 10.

Proof (sketch). Since $L^{>\lambda}(\text{PWA}) \supseteq L^{>0}(\text{PWA})$ (see Theorem 10), $\overline{L^{>0}(\text{PWA})} =
L^{=1}(\text{PWA})$, and the class of regular $\omega$-languages is closed under complement, it
suffices to show the statement for PWA$^{=1}$. We do this by reduction from the
value 1 problem for PFA, which is the question whether for each $\varepsilon > 0$ there exists
a word accepted by the PFA with probability $> 1 - \varepsilon$. This problem is known
to be undecidable [13]. We consider a slightly modified version of the problem
by assuming that no word is accepted with probability 1 by the given PFA. The
problem remains undecidable under this assumption, because one can check if a
PFA accepts a finite word with probability 1 by a simple subset construction.

Given some PFA $\mathcal{A}$, we construct a PWA$^{=1}$ $\mathcal{B}$ by taking a copy of $\mathcal{A}$ and
extending it with a new symbol $\#$ such that from accepting states of $\mathcal{A}$ the
automaton is “restarted” on $\#$, while from non-accepting states $\#$ leads into a
new part which ensures that infinitely many $\#$ are seen and contains the only
accepting state of $\mathcal{B}$. We show that $L^{=1}(\mathcal{B}) = (\Sigma^*\#)^\omega \setminus R$, where $R = \emptyset$ if $\mathcal{A}$
does not have value 1, and $R$ is non-empty but does not contain an ultimately
periodic word, otherwise. This implies that $L^{=1}(\mathcal{B})$ is regular iff $\mathcal{A}$ does not have
value 1.
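
The subset construction mentioned in the proof sketch can be spelled out as follows. This is a minimal sketch under stated assumptions: the PFA is complete (every state has a successor on every symbol), only the supports of the transition distributions are given, and the encoding as a dictionary keyed by (state, symbol) as well as all names are hypothetical. A finite word is accepted with probability 1 exactly if the set of states reached with positive probability is contained in the accepting states.

```python
from collections import deque


def accepts_some_word_with_probability_one(support_delta, alphabet,
                                           initial_support, accepting):
    """Search the reachable supports for one that lies inside `accepting`."""
    accepting = frozenset(accepting)
    start = frozenset(initial_support)
    seen = {start}
    queue = deque([start])
    while queue:
        support = queue.popleft()
        # Guard against the empty support, which only arises for incomplete input.
        if support and support <= accepting:
            return True
        for symbol in alphabet:
            successor = frozenset(
                target
                for state in support
                for target in support_delta.get((state, symbol), ())
            )
            if successor not in seen:
                seen.add(successor)
                queue.append(successor)
    return False


# Hypothetical PFA: on `a`, state p splits between p and q; q is absorbing.
delta_a = {("p", "a"): {"p", "q"}, ("q", "a"): {"q"}}
# Adding a symbol `b` that moves everything to q makes probability 1 attainable.
delta_ab = dict(delta_a)
delta_ab.update({("p", "b"): {"q"}, ("q", "b"): {"q"}})
```

In the first example some mass always remains in `p`, so no single finite word is accepted with probability exactly 1, although longer words can come arbitrarily close when the split is, say, one half.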


We will now show that PWA with almost-sure semantics are as expressive as 
PBA, and with positive semantics as expressive as PCA. 


Theorem 8. $L^{>0}(\text{PWA}) = L^{>0}(\text{PCA})$ and $L^{=1}(\text{PWA}) = L^{=1}(\text{PBA})$.


Proof (sketch). It suffices to show the first statement. The second then follows
by duality, i.e., we can interpret a PBA$^{=1}$ $\mathcal{A}$ recognizing $L$ as a PCA$^{>0}$
recognizing $\overline{L}$ and just apply the construction to get a PWA$^{>0}$ $\mathcal{B}$ for $\overline{L}$, such
that $\overline{\mathcal{B}}$ (with inverted accepting and rejecting states) is a PWA$^{=1}$ for $L$. In
the first statement the $\subseteq$ inclusion is trivial, hence we only need to show that
$L^{>0}(\text{PCA}) \subseteq L^{>0}(\text{PWA})$.

We construct a PWA$^{>0}$ consisting of two copies of the original PCA$^{>0}$, a
guess copy and a verify copy. In the first copy, the automaton can guess that
no final states will be visited anymore and switch to the verify copy, which is
accepting, but where all transitions into final states are redirected to a rejecting
sink.
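
The two-copy construction can be sketched in code. The proof sketch does not say how the probabilistic "guess" is realized; the version below assumes that in the guess copy every transition keeps half of its mass in the guess copy and sends the other half to the verify copy. This split, the tuple encoding of states, and the function name are assumptions made for illustration only.

```python
from fractions import Fraction


def pca_to_pwa(delta, final, alphabet):
    """Build transitions and accepting states of a guess/verify automaton.

    delta maps (state, symbol) to {target: probability}; `final` is the set
    of states that an accepting co-Buchi run may visit only finitely often.
    """
    half = Fraction(1, 2)
    sink = "sink"
    new_delta = {}

    def add(source, symbol, target, probability):
        row = new_delta.setdefault((source, symbol), {})
        row[target] = row.get(target, Fraction(0)) + probability

    states = set()
    for (state, symbol), row in delta.items():
        states.add(state)
        for target, probability in row.items():
            states.add(target)
            # In the verify copy, entering a final state means the guess was wrong.
            verify_target = sink if target in final else ("verify", target)
            add(("guess", state), symbol, ("guess", target), probability * half)
            add(("guess", state), symbol, verify_target, probability * half)
            add(("verify", state), symbol, verify_target, probability)
    for symbol in alphabet:
        add(sink, symbol, sink, Fraction(1))
    accepting = {("verify", state) for state in states if state not in final}
    return new_delta, accepting


# Hypothetical PCA with final state `a` and absorbing state `b`.
example_delta = {
    ("a", "x"): {"a": Fraction(1, 2), "b": Fraction(1, 2)},
    ("b", "x"): {"b": Fraction(1)},
}
pwa_delta, pwa_accepting = pca_to_pwa(example_delta, {"a"}, ["x"])
```

The result is weak by construction: transitions only lead from the guess copy to the verify copy and from there to the sink, never back, so every strongly connected component is either entirely accepting or entirely rejecting.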


Next, we show that languages that can be accepted both by a PWA with
almost-sure semantics and by a PWA with positive semantics are regular and
can be accepted by a DWA. For the proof, we rely on a characterization of
DWA languages in terms of the Myhill-Nerode equivalence relation from [22]. So
we first define this equivalence, and show that languages defined by PBA with
positive semantics have only finitely many equivalence classes. Then we come
back to the result for PWA.

For $L \subseteq \Sigma^\omega$, define the Myhill-Nerode equivalence relation $\sim_L \subseteq \Sigma^* \times \Sigma^*$ by
$u \sim_L v$ iff $uw \in L \Leftrightarrow vw \in L$ for all $w \in \Sigma^\omega$. Then the following holds:


Lemma 6 (Finitely many Myhill-Nerode classes).
Languages in $L^{>0}(\text{PBA})$ have finitely many Myhill-Nerode equivalence classes.


Proof. Let $\mathcal{A} = (Q, \Sigma, \delta, \mu_0, F)$ be some PBA$^{>0}$ and $u \in \Sigma^*$ some word and let
$\mu_u := \delta^*(\mu_0, u)$ be the probability distribution on states of $\mathcal{A}$ after reading $u$.
Pick any $w \in \Sigma^\omega$ and notice that $uw \in L = L^{>0}(\mathcal{A})$ iff there exists some state
$q$ such that $\mu_u(q) > 0$ and the probability to accept $w$ from $q$ is also $> 0$, as the
product of two positive numbers clearly still is positive. But then, for any two
$u, v \in \Sigma^*$ we have that whenever $\mu_u(q) > 0 \Leftrightarrow \mu_v(q) > 0$ for all $q$, then we have
$uw \in L \Leftrightarrow vw \in L$ for all $w \in \Sigma^\omega$ by the reasoning above, as the exact value
does not matter for acceptance, and therefore $u \sim_L v$. But as there are only at
most $2^{|Q|}$ different possibilities how values in a distribution $\mu$ over $Q$ are either
equal to or greater than 0, this is an upper bound on the number of different
equivalence classes.
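
The counting argument can be illustrated by enumerating the supports that are actually reachable in a given automaton; by the proof above, two words leading to the same support are Myhill-Nerode equivalent, so the number of reachable supports bounds the number of classes. The sketch below is illustrative only: the encoding of transitions by their supports and the function name `reachable_supports` are assumptions.

```python
def reachable_supports(support_delta, alphabet, initial_support):
    """Enumerate the supports of all distributions reachable by finite words."""
    start = frozenset(initial_support)
    seen = {start}
    stack = [start]
    while stack:
        support = stack.pop()
        for symbol in alphabet:
            successor = frozenset(
                target
                for state in support
                for target in support_delta.get((state, symbol), ())
            )
            if successor not in seen:
                seen.add(successor)
                stack.append(successor)
    return seen


# Hypothetical two-state automaton over the alphabet {a, b}.
example_supports = reachable_supports(
    {
        ("p", "a"): {"p", "q"},
        ("p", "b"): {"p"},
        ("q", "a"): {"q"},
        ("q", "b"): {"q"},
    },
    ["a", "b"],
    {"p"},
)
```

Here only two of the four possible supports are reachable, so the language has at most two Myhill-Nerode classes under positive semantics, well within the general bound of $2^{|Q|}$.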


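Theorem 9. $L^{>0}(\text{PWA}) \cap L^{=1}(\text{PWA}) = L(\text{DWA}) = L(\text{PWA}^{0/1})$


Proof. The inclusions $L(\text{DWA}) \subseteq L(\text{PWA}^{0/1}) \subseteq L^{>0}(\text{PWA}) \cap L^{=1}(\text{PWA})$ are
trivial, hence it remains to show $L^{>0}(\text{PWA}) \cap L^{=1}(\text{PWA}) \subseteq L(\text{DWA})$.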

So let $L$ be a language from $L^{>0}(\text{PWA}) \cap L^{=1}(\text{PWA})$. We want to show that
$L$ can be accepted by a DWA. We use the following characterization of DWA
languages [22, Theorem 21]: The DWA languages are precisely the languages with
finitely many Myhill-Nerode classes in the class $G_\delta \cap F_\sigma$ in the Borel hierarchy.
The classes $G_\delta$ and $F_\sigma$ of the Borel hierarchy are often also referred to as $\Pi_2$
and $\Sigma_2$. We do not introduce the details of this hierarchy here, but rather refer
the reader not familiar with these concepts to [22] and [8].

We already know that $L$ has finitely many Myhill-Nerode classes by Lemma 6
(as PWA are special cases of PBA). It remains to show that $L$ is in the class
$G_\delta \cap F_\sigma$. It is known that PBA with almost-sure semantics define languages
in $G_\delta$ [8, Lemma 3.2]. Hence $L$ is in $G_\delta$. Since $L$ is accepted by a PWA with
positive semantics, the complement of $L$ is accepted by a PWA with almost-sure
semantics (as noted at the beginning of this section). We obtain that the
complement of $L$ is also in $G_\delta$, again by [8, Lemma 3.2]. This means that $L$ is in
$F_\sigma$, which by definition consists of the complements of languages from $G_\delta$.


Concluding this section, we show a result about weak automata with thresh- 
old semantics, which (not surprisingly) turn out to be even more expressive. A 
careful analysis of the PWA A in Fig. 2(a) shows the following result: 


Proposition 6. For all thresholds $\lambda \in ]0,1[$ there exists a PWA $\mathcal{A}$ such that
$L^{>\lambda}(\mathcal{A})$ is not regular and not PBA$^{>0}$-recognizable.

Putting things together, we can say the following about threshold PWA,
establishing the relation of $L^{>\lambda}(\text{PWA})$ to the other classes in Figure 3:


Theorem 10 (Expressive power of threshold PWA). 


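1. $L^{>0}(\text{PWA}) \subseteq L^{>\lambda}(\text{PWA}) \cap L^{>0}(\text{PBA})$.
2. $L^{>\lambda}(\text{PWA})$ and $L^{>0}(\text{PBA})$ are incomparable (w.r.t. set inclusion).
3. $L^{>0}(\text{PWA}) \subsetneq L^{>\lambda}(\text{PWA}) \subsetneq L^{>\lambda}(\text{PBA})$.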


Proof. (1.) $L^{>0}(\text{PWA}) \subseteq L^{>0}(\text{PBA})$ by definition and $L^{>0}(\text{PWA}) \subseteq L^{>\lambda}(\text{PWA})$,
as any PWA$^{>0}$ can be modified to a PWA$^{>\lambda}$ recognizing the same language by
just adding an additional accepting sink and modifying the initial distribution,
just as described in [4, Lemma 4.16] for general PBA.
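
The arithmetic behind the accepting-sink transformation is short. The sketch below assumes one natural way to realize it, which may differ in detail from [4, Lemma 4.16]: the new initial distribution puts mass $\lambda$ on the accepting sink and scales the original initial distribution by $1 - \lambda$. The function name is hypothetical.

```python
from fractions import Fraction


def acceptance_with_sink(p, lam):
    """Acceptance probability after moving initial mass `lam` to an accepting sink.

    `p` is the acceptance probability of the word in the original automaton.
    """
    return lam + (1 - lam) * p


threshold = Fraction(1, 3)
unchanged_reject = acceptance_with_sink(Fraction(0), threshold)
small_positive = acceptance_with_sink(Fraction(1, 1000), threshold)
```

Under this assumption a word exceeds the threshold $\lambda$ in the modified automaton exactly if its original acceptance probability was positive, since $\lambda + (1 - \lambda) p > \lambda$ iff $p > 0$ for $\lambda < 1$. The added sink is a single accepting component, so weakness is preserved.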

(2.) By Proposition 6, there are languages recognized by PWA$^{>\lambda}$ that cannot
be recognized with PBA$^{>0}$. To show that there are languages accepted by PBA$^{>0}$
that cannot be accepted by PWA$^{>\lambda}$ we can give a topological characterization
of languages accepted by PWA by a simple adaptation of [8, Lemma 3.2] and
combine it with other results shown in [8] to show that there are PBA$^{>0}$ that
accept languages that cannot be accepted by PWA$^{>\lambda}$.

(3.) The first inclusion was discussed in (1.), the strictness follows from
Proposition 6 and the fact that $L^{>0}(\text{PWA}) = \overline{L^{=1}(\text{PBA})} \subsetneq \mathrm{BCl}(L^{=1}(\text{PBA})) =
L^{>0}(\text{PBA})$, where the first equality is Theorem 8 and the second is shown in [8].
The second inclusion of the statement follows from (2.) and the fact from [4]
that $L^{>0}(\text{PBA}) \subsetneq L^{>\lambda}(\text{PBA})$.


For the dual class $L^{\geq\lambda}(\text{PWA})$ one can show symmetric results that correspond
to statements (1.) and (2.) above. For statement (3.), however, there is no proof
yet for the strictness of the inclusions (especially the second one), whereas the
statement $L^{=1}(\text{PWA}) \subseteq L^{\geq\lambda}(\text{PWA}) \subseteq L^{\geq\lambda}(\text{PBA})$ is obvious. We leave this issue
as an open question. Another interesting question is whether $> \lambda$ is equivalent
to $< \lambda$ (or dually for $\geq$ / $\leq$).


6 Conclusion 


By using notions from ambiguity in classical Büchi automata, we were able to
extend the set of easily (syntactically) checkable PBA which are regular under 
some or all of the usual semantics. As a consequence, ambiguity appears to 
be an even more interesting notion in the probabilistic setting, as here it in 
fact has consequences for the expressive power of automata, whereas in the 
classical setting there is no such effect. Our results also indicate that to get 
non-regularity, one requires the use of certain structural patterns which at least 
imply the existence of the ambiguity patterns that we used. It is an open question 
whether it is possible to identify more fine-grained syntactic characterizations, 
patterns or easily checkable properties which are just over-approximated by the 
ambiguity patterns and are required for non-regularity.

Abstract. Modelling and reasoning about dynamic memory allocation
is one of the well-established strands of theoretical computer science, 
which is particularly well-known as a source of notorious challenges in 
semantics, reasoning, and proof theory. We capitalize on recent progress 
on categorical semantics of full ground store, in terms of a full ground 
store monad, to build a corresponding semantics of a higher order logic 
over the corresponding programs. Our main result is a construction of an 
(intuitionistic) BI-hyperdoctrine, which is arguably the semantic core of
higher order logic over local store. Although we have made extensive use
of the existing generic tools, certain principled changes had to be made to
enable the desired construction: while the original monad works over total 
heaps (to disable dangling pointers), our version involves partial heaps 
(heaplets) to enable compositional reasoning using separating conjunction. 
Another remarkable feature of our construction is that, in contrast to the 
existing generic approaches, our BI-algebra does not directly stem from 
an internal categorical partial commutative monoid. 


1 Introduction 


Modelling and reasoning about dynamic memory allocation is a sophisticated 
subject in denotational semantics with a long history (e.g. [19,15,14,16]). De- 
notational models for dynamic references vary over a large spectrum, and in 
fact, in two dimensions: depending on the expressivity of the features being 
modelled (ground store — full ground store — higher order store) and depending 
on the amount of intensional information included in the model (intensional — 
extensional), using the terminology of Abramsky [1]. 

Recently, Kammar et al. [9] constructed an extensional monad-based denotational
model of the full ground store, i.e. permitting not only memory allocation
for discrete values, but also storing mutually linked data. The key idea of the latter
work is an explicit delineation between the target presheaf category [W, Set]
on which the full ground store monad acts, and an auxiliary presheaf category
[E, Set] of initializations, naturally hosting a heap functor H. The latter category
also hosts a hiding monad P, which can be loosely understood as a semantic
mechanism for idealized garbage collection. The full ground store monad is then
assembled according to the scheme given in Fig. 1. As a slogan: the local store
monad is a global store monad transform of the hiding monad sandwiched within
a geometric morphism.

Fig. 1: Construction of the full ground store monad.

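The fundamental reason why extensional models of local store involve intricate
constructions, such as presheaf categories, is that the desirable program equalities
include


$\mathsf{let}\ \ell := \mathsf{new}\ v;\ \ell' := \mathsf{new}\ w\ \mathsf{in}\ p \;=\; \mathsf{let}\ \ell' := \mathsf{new}\ w;\ \ell := \mathsf{new}\ v\ \mathsf{in}\ p \quad (\ell \neq \ell')$
$\mathsf{let}\ \ell := \mathsf{new}\ v\ \mathsf{in}\ \mathsf{ret}\ \star \;=\; \mathsf{ret}\ \star$
$\mathsf{let}\ \ell := \mathsf{new}\ v\ \mathsf{in}\ (\mathsf{if}\ \ell = \ell'\ \mathsf{then}\ \mathsf{true}\ \mathsf{else}\ \mathsf{false}) \;=\; \mathsf{false} \quad (\ell \neq \ell')$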


and these jointly do not have set-based models over countably infinite sets of 
locations [23, Proposition 6]. The first equation expresses irrelevance of the 
memory allocation order, the second expresses the fact that an unused cell is 
always garbage collected and the third guarantees that allocation of a fresh 
cell does indeed produce a cell different from any other. The aforementioned 
construction validates these equations and enjoys further pleasant properties, e.g. 
soundness and adequacy of a higher order language with user defined storable 
data structures. 

The goal of our present work is to complement the semantics of programs over 
local store with a corresponding principled semantics of higher order logic. In order 
to be able to specify and reason modularly about local store, more specifically, 
we seek a model of higher order separation logic [21]. It has been convincingly 
argued in previous work on categorical models of separation logic [2,3] that a core 
abstraction device unifying such models is a notion of BI-hyperdoctrine, extending
Lawvere’s hyperdoctrines [10], which provide a corresponding abstraction for the
first order logic. BI-hyperdoctrines are standardly built on BI-algebras, which
are also standardly constructed from partial commutative monoids (pcm), or
more generally from resource algebras as in Iris, the state-of-the-art
framework for higher order separation logic [8]. One subtlety our construction
reveals is that it does not seem to be possible to obtain a BI-algebra following
general recipes from a pcm (or a resource algebra), due to the inherent local
nature of the storage model, which does not allow one to canonically map store 
contents into a global address space. Another subtlety is that the devised logic 
is necessarily non-classical, which is intuitively explained by the fact that the 
semantics of programs must be suitably irrelevant to garbage collection, and in
our case this follows from entirely formal considerations (Yoneda lemma). It is
also worth mentioning that for this reason the logical theory that we obtain is
incompatible with the standard (classical or intuitionistic) predicate logic. E.g.
the formula $\exists \ell.\ \ell \hookrightarrow 5$ is always valid in our setup, which expresses the fact that
a heap potentially contains a cell equal to 5 (which need not be reachable); this
is in accord with the second equation above. Correspondingly, the formula
$\forall \ell.\ \neg(\ell \hookrightarrow 5)$ is unsatisfiable. This and other similar phenomena are explained
by the fact that our semantics essentially behaves as a Kripke semantics along
two orthogonal axes: (proof relevant) cell allocation and (proof irrelevant) cell
accessibility. While the former captures a programming view of locality, the latter
captures a reasoning view of locality, and as we argue (e.g. Example 26), they
are generally mutually irreducible.


Related previous work As we already pointed out, we take inspiration 
from the recent categorical approaches to modelling program semantics for 
dynamic references [9], as well as from higher order separation logic semantic 
frameworks [2]. Conceptually, the problem of combining separation logic with 
garbage collection mechanisms goes back to Reynolds [20], who indicated that
standard semantics of separation logic is not compatible with garbage collection,
which we also reinforce with our construction. Calcagno et al. [4] addressed this
issue by providing two models. The first model is based on total heaps, featuring 
the aforementioned effect of “potential” allocations. To cope with heap separation 
the authors introduced another model based on partial heaps, in which this effect 
again disappears, and has to be compensated by syntactic restrictions on the 
assertion language. 


Plan of the paper After preliminaries (Section 2), we give a modified presen- 
tation of a call-by-value language with full ground references and the full ground 
store monad (Sections 3 and 4) following the lines of [9]. In Section 5 we provide 
some general results for constructing semantics of higher order separation logics. 
The main development starts in Section 6 where we provide a construction of a
BI-hyperdoctrine. We show some examples illustrating our semantics in Section 7
and draw conclusions in Section 8.


2 Preliminaries 


We assume basic familiarity with the elementary concepts of category theory [12,6], 
all the way up to monads, toposes, (co)ends and Kan extensions. We denote by
$|C|$ the class of objects of a category $C$; we often suppress subscripts of natural
transformation components if no confusion arises.

Local Local Reasoning: A BI-Hyperdoctrine for Full Ground Store 545

$\dfrac{\Gamma \vdash_v \ell: Ref_S \qquad \Gamma \vdash_v v: CType(S)}{\Gamma \vdash_c \ell := v : 1}$ (put) $\qquad \dfrac{\Gamma \vdash_v \ell: Ref_S}{\Gamma \vdash_c\ !\ell : CType(S)}$ (get)

$\dfrac{\Gamma, \ell_1: Ref_{S_1}, \ldots, \ell_n: Ref_{S_n} \vdash_v v_i: CType(S_i)\ \ (i = 1, \ldots, n) \qquad \Gamma, \ell_1: Ref_{S_1}, \ldots, \ell_n: Ref_{S_n} \vdash_c p: A}{\Gamma \vdash_c \mathsf{letref}\ \ell_1 := v_1, \ldots, \ell_n := v_n\ \mathsf{in}\ p : A}$ (new)

Fig. 2: Term formation rules for memory management constructs.

In this paper, we work with special kinds of covariant presheaf toposes,
i.e. functor categories of the form $[C, Set]$, where $C$ is small and satisfies the
following amalgamation condition: for any $f: a \to b$ and $g: a \to c$ there exist
$g': b \to d$ and $f': c \to d$ such that $f' \circ g = g' \circ f$. Such toposes are particularly
well-behaved, and fall into the more general class of De Morgan toposes [7]. As
presheaf toposes, De Morgan toposes are precisely characterized by the condition
that $2 = 1 + 1$ is a retract of the subobject classifier $\Omega$. More specifically, our $C$
support further useful structure, in particular, a strict monoidal tensor $\oplus$ with
jointly epic injections $inj_1$, $inj_2$, forming an independent coproduct structure, as
recently identified by Simpson [22]. Moreover, if the coslices $c \downarrow C$ support
independent products, we obtain local independent coproducts in $C$, which are
essentially cospans $c_1 \to c_1 \oplus_c c_2 \leftarrow c_2$ in $c \downarrow C$. Given $\rho_1: c \to c_1$ and $\rho_2: c \to c_2$,
we thus always have $\rho_1 \bullet \rho_2: c_1 \to c_1 \oplus_c c_2$ and $\rho_2 \bullet \rho_1: c_2 \to c_1 \oplus_c c_2$, such that
$(\rho_1 \bullet \rho_2) \circ \rho_1 = (\rho_2 \bullet \rho_1) \circ \rho_2$, and as a consequence, $[C, Set]$ is a De Morgan
topos. Intuitively, the category $C$ represents worlds in the sense of possible world
semantics [15,19]. A morphism $\rho: a \to b$ witnesses the fact that $b$ is a future
world w.r.t. $a$. Existence of local independent products intuitively ensures that
diverse futures of a given world can eventually be unified in a canonical way.
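
To make the amalgamation condition tangible, the following sketch instantiates it in the category of finite sets and injections, used here as a simplified stand-in for the categories of worlds considered in this paper. The encoding of injections as dictionaries, the tagging scheme for the amalgam, and the function name `amalgamate` are all assumptions made for illustration.

```python
def amalgamate(f, g, b, c):
    """Complete injections f: a -> b and g: a -> c to a commuting square.

    Returns injections g2: b -> d and f2: c -> d (as dictionaries) such that
    f2[g[x]] == g2[f[x]] for every x in a. The amalgam d consists of all
    elements of b plus those elements of c outside the image of g.
    """
    g_inverse = {g[x]: x for x in f}
    g2 = {y: ("b", y) for y in b}
    f2 = {
        z: g2[f[g_inverse[z]]] if z in g_inverse else ("c", z)
        for z in c
    }
    return g2, f2


# Two futures of the one-element world {0}, each adding one more element.
f_example = {0: "x"}
g_example = {0: "u"}
g2_example, f2_example = amalgamate(f_example, g_example, {"x", "y"}, {"u", "v"})
```

The shared element of the original world is identified in the amalgam, while the elements added independently by the two futures remain distinct, matching the intuition that diverse futures of a world can be unified.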

Every functor $f: C \to D$ induces a functor $f^*: [D, Set] \to [C, Set]$ by
precomposition with $f$. By general considerations, there is a right adjoint
$f_*: [C, Set] \to [D, Set]$, computed as $\mathrm{Ran}_f$, the right Kan extension along $f$.
This renders the adjunction $f^* \dashv f_*$ as a geometric morphism; in particular, $f^*$
preserves all finite limits.


3 A Call-by-Value Language with Local References 


To set the context, we consider the following higher order language of programs
with local references by slightly adapting the language of Kammar et al. [9] to
match with the fine-grain call-by-value perspective [11]. This allows us to formally
distinguish pure and effectful judgements. First, we postulate a collection of cell
sorts $\mathcal{S}$ and then introduce further types with the grammar:


$A, B, \ldots ::= 0 \mid 1 \mid A \times B \mid A + B \mid A \to B \mid Ref_S \quad (S \in \mathcal{S})$ (1)


A type is first order if it does not involve the function type constructor $A \to B$.
We then fix a map CType, assigning a first order type to every given sort from $\mathcal{S}$.
We show three term formation rules over these data in Fig. 2 specific to local store.
Here the v-indices at the turnstiles indicate values and the c-indices indicate
computations. In (put) the cell referenced by $\ell$ is updated with a value $v$, (get)
returns a value under the reference $\ell$ and (new) simultaneously allocates new
cells filled with the values $v_1, \ldots, v_n$ and makes them accessible in $p$ under
the corresponding references $\ell_1, \ldots, \ell_n$. A fine-grain call-by-value language is
interpreted standardly in a category with a monad, which in our case must
additionally provide a semantics to the rules (put), (get) and (new). We
present this monad in detail in the next section.


Example 1 (Doubly Linked Lists). Let $\mathcal{S} = \{DLList\}$ and let
$CType(DLList) = 2 \times (Ref_{DLList} + 1) \times (Ref_{DLList} + 1)$, which indicates that a list
element is a Boolean (i.e. an element of $2 = 1 + 1$) and two pointers (forwards
and backwards) to list elements, each of which may be missing. Note that we
thus avoid empty lists and null-pointers: every list contains at least one element,
and the elements added by $+1$ cannot be dereferenced. This example provides a
suitable illustration for the letref construct. E.g. the program


$\mathsf{letref}\ \ell_1 := (0, \mathsf{inr}\ \star, \mathsf{inl}\ \ell_2);\ \ell_2 := (1, \mathsf{inl}\ \ell_1, \mathsf{inr}\ \star)\ \mathsf{in}\ \mathsf{ret}\ \ell_1$


simultaneously creates two list elements pointing to each other and returns a
reference to the first one.
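
The point of simultaneous allocation is that each new cell may mention the locations of the others. The following is a small operational sketch of this idea, not the denotational semantics developed in this paper; the heap-as-dictionary model, the tagged tuples standing in for the sum type, and the function name `letref` are all assumptions.

```python
def letref(heap, initializers):
    """Allocate one fresh cell per initializer, all at once.

    Each initializer is a function from the tuple of all new locations to a
    cell value, so that the new cells can refer to each other.
    """
    base = max(heap, default=-1) + 1
    locations = tuple(range(base, base + len(initializers)))
    new_heap = dict(heap)
    for location, initializer in zip(locations, initializers):
        new_heap[location] = initializer(locations)
    return new_heap, locations


MISSING = ("inr", None)  # stands for the unit summand: no pointer present

list_heap, (first, second) = letref(
    {},
    [
        lambda locs: (0, MISSING, ("inl", locs[1])),
        lambda locs: (1, ("inl", locs[0]), MISSING),
    ],
)
```

After the call, the two cells point to each other, and `first` plays the role of the returned reference. Allocating the cells one at a time would require a placeholder for the not yet existing partner, which is exactly what the absence of null-pointers rules out.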


4 Full Ground Store in the Abstract 


We proceed to present the full ground store monad by slightly tweaking the
original construction [9] towards higher generality. The main distinction is that we
do not resort to any specific program syntax and proceed in a completely axiomatic
manner in terms of functors and natural transformations. This mainly serves the
purpose of developing our logic in Section 6, which will require a coherent upgrade
of the present model. Besides this, in this section we demonstrate flexibility of
our formulation by showing that it also instantiates to the model previously
developed by Plotkin and Power [16] (Theorem 8).

Our present formalization is parametric in three aspects: the set of sorts $\mathcal{S}$,
the set of locations $\mathcal{L}$ and a map range, introduced below for interpreting $\mathcal{S}$. We
assume that $\mathcal{L}$ is canonically isomorphic to the set of natural numbers $\mathbb{N}$ under
$\#: \mathcal{L} \cong \mathbb{N}$. Using this isomorphism, we commonly use the “shift of $\ell \in \mathcal{L}$ by
$n \in \mathbb{N}$”, defined as follows: $\ell + n = \#^{-1}(\#\ell + n)$.


Heap layouts and abstract heap(let)s Let $W$ be a category of (heap) layouts
and injections defined as follows: an object $w \in |W|$ is a finitely supported partial
function $w: \mathcal{L} \rightharpoonup_{fin} \mathcal{S}$ and a morphism $\rho: w \to w'$ is a type preserving injection
$\rho: \mathrm{dom}\, w \to \mathrm{dom}\, w'$, i.e. for all $\ell \in \mathrm{dom}\, w$, $w(\ell) = w'(\rho(\ell))$. We will equivalently
view $w$ as a left-unique subset of $\mathcal{L} \times \mathcal{S}$ and hence use the notation $(\ell: S) \in w$
as an equivalent of $w(\ell) = S$. Injections $\rho: w \to w'$ with the property that
$\rho(\ell: S) = (\ell: S)$ for all $(\ell: S) \in w$ we also call inclusions and write $w \subseteq w'$ instead
of $\rho: w \to w'$, for obviously there is at most one inclusion from $w$ to $w'$. If $w \subseteq w'$
then we call $w$ a sublayout of $w'$. We next postulate

$range: \mathcal{S} \to [W, Set]$.


The idea is, given a sort $S \in \mathcal{S}$ and a heap layout $w \in |W|$, $range(S)(w)$ yields
the set of possible values for cells of type $S$ over $w$.
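
The definition of layouts and their morphisms can be checked mechanically on small examples. The sketch below represents a layout as a dictionary from locations (natural numbers) to sort names and a candidate morphism as a dictionary on locations; this encoding and the function names are assumptions made for illustration.

```python
def is_morphism(rho, w, w_prime):
    """Check that rho is a type preserving injection from dom w to dom w'."""
    if set(rho) != set(w):
        return False  # rho must be defined exactly on dom w
    if len(set(rho.values())) != len(rho):
        return False  # rho must be injective
    return all(
        rho[loc] in w_prime and w_prime[rho[loc]] == w[loc] for loc in w
    )


def is_inclusion(rho, w, w_prime):
    """Check that rho is a morphism that keeps every location in place."""
    return is_morphism(rho, w, w_prime) and all(rho[loc] == loc for loc in w)


small = {0: "Int"}
large = {0: "Int", 1: "Ref_Int"}
moved = {5: "Int"}
```

With these data, the identity on location 0 is an inclusion of `small` into `large`, so `small` is a sublayout of `large`; sending 0 to 5 is a morphism into `moved` but not an inclusion; and sending 0 to 1 in `large` fails type preservation.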


Example 2. Assuming the grammar (1) and a corresponding map CType, a
generic type $A$ is interpreted as a presheaf $\underline{A}\colon \mathbf{W} \to \mathbf{Set}$,
by obvious structural induction, e.g. $\underline{A \times B} = \underline{A} \times \underline{B}$,
except for the clause for Ref, for which $(\underline{\mathrm{Ref}_S})w = w^{-1}(S)$. This yields
the following definition for range: $\mathrm{range}(S) = \underline{\mathrm{CType}(S)}$ [9].


Example 3 (Simple Store). By taking $\mathcal{S} = \{\star\}$, $\mathcal{L} = \mathbb{N}$
(natural numbers) and $\mathrm{range}(\star)(w) = \mathcal{V}$ where $\mathcal{V}$ is a fixed
set of values, we essentially obtain the model previously explored by Plotkin and
Power [16]. We reserve the term simple store for this instance. Simple store is a
ground store (since range is a constant functor), moreover this store is untyped
(since $\mathcal{S} = \{\star\}$) and the locations $\mathcal{L}$ are precisely the natural numbers.


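A heap over a layout $w$ assigns to each $(\ell\colon S) \in w$ an element from
$\mathrm{range}(S)(w)$. More generally, a heaplet over $w$ assigns an element from
$\mathrm{range}(S)(w)$ to some, possibly not all, $(\ell\colon S) \in w$. We thus define the
following heaplet bi-functor $H\colon \mathbf{W}^{op} \times \mathbf{W} \to \mathbf{Set}$:

\[ H(w^-, w^+) = \prod_{(\ell\colon S) \in w^-} \mathrm{range}(S)(w^+) \]

and identify the elements of $H(w^-, w^+)$ with heaplets and the elements of $H(w, w)$
with heaps. Of course, we intend to use $H(w^-, w^+)$ for such $w^-$ and $w^+$ that
the former is a sublayout of the latter. The contravariant action of $H$ is given by
projection and the covariant action is induced by functoriality of $\mathrm{range}(S)$:

\[
\begin{aligned}
\mathrm{pr}_{(\ell\colon S)}\bigl(H(w^-, \rho_1\colon w_1^+ \to w_2^+)(\eta \in H(w^-, w_1^+))\bigr)
  &= \mathrm{range}(S)(\rho_1)(\mathrm{pr}_{(\ell\colon S)}\,\eta) \\
\mathrm{pr}_{(\ell\colon S)}\bigl(H(\rho_2\colon w_1^- \to w_2^-, w^+)(\eta \in H(w_2^-, w^+))\bigr)
  &= \mathrm{pr}_{\rho_2(\ell\colon S)}\,\eta
\end{aligned}
\]

The heaplet functor preserves independent coproducts; we overload the $\oplus$ operation
with the isomorphism $\oplus\colon H(w_1, w) \times H(w_2, w) \cong H(w_1 \oplus w_2, w)$.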


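Example 4. For illustration, consider the following simplistic example. Let
$\mathcal{S} = \{\mathrm{Int}, \mathrm{Ref}_{\mathrm{Int}}, \mathrm{Ref}_{\mathrm{Ref}_{\mathrm{Int}}}, \ldots\}$
where Int is meant to capture the ground type of integers and recursively,
$\mathrm{Ref}_A$ is the type of pointers to $A$. Then, we put

\[ \mathrm{range}(\mathrm{Int})(w) = \mathbb{Z}, \qquad
   \mathrm{range}(\mathrm{Ref}_S)(w) = w^{-1}(S) = \{\ell \in \operatorname{dom} w \mid w(\ell) = S\}. \]

For a heaplet example, consider
$w^- = \{\ell_1\colon \mathrm{Int}, \ell_2\colon \mathrm{Ref}_{\mathrm{Int}}\}$ and
$w^+ = \{\ell_1\colon \mathrm{Int}, \ell_2\colon \mathrm{Ref}_{\mathrm{Int}}, \ell_3\colon \mathrm{Int}\}$.
Hence, $w^-$ is a sublayout of $w^+$. By viewing the elements of $H(w^-, w^+)$ as lists of
assignments on $w^-$, we can define $s_1, s_2 \in H(w^-, w^+)$ as follows:
$s_1 = [\ell_1\colon \mathrm{Int} \mapsto 5, \ell_2\colon \mathrm{Ref}_{\mathrm{Int}} \mapsto \ell_1]$,
$s_2 = [\ell_1\colon \mathrm{Int} \mapsto 3, \ell_2\colon \mathrm{Ref}_{\mathrm{Int}} \mapsto \ell_3]$.
The heaplets $s_1$ and $s_2$ can be graphically presented as follows:

[Diagram: $s_1$ and $s_2$, each drawn as a heaplet over $w^-$ with values in $w^+$.]

Fig. 3: Local independent coproduct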
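
The heaplets of Example 4 can also be checked mechanically. The following is a minimal sketch under stated assumptions: locations are the strings `"l1"`, `"l2"`, `"l3"`, sorts are the strings `"Int"` and `"Ref Int"`, and `range_of` and `is_heaplet` are hypothetical helper names. Because $\mathrm{range}(\mathrm{Int})(w) = \mathbb{Z}$ is infinite, `range_of` returns a membership test rather than a set.

```python
def range_of(sort, w):
    """range(S)(w) as a membership predicate on candidate values."""
    if sort == "Int":
        return lambda v: isinstance(v, int)
    if sort.startswith("Ref "):
        target = sort[len("Ref "):]
        # a reference of sort Ref S must name a cell of sort S in w
        return lambda v: v in w and w[v] == target
    raise ValueError(f"unknown sort: {sort}")


def is_heaplet(eta, w_minus, w_plus):
    """eta is in H(w_minus, w_plus): one admissible value per cell of w_minus."""
    return set(eta) == set(w_minus) and all(
        range_of(w_minus[l], w_plus)(eta[l]) for l in w_minus
    )


w_minus = {"l1": "Int", "l2": "Ref Int"}
w_plus = {"l1": "Int", "l2": "Ref Int", "l3": "Int"}
s1 = {"l1": 5, "l2": "l1"}
s2 = {"l1": 3, "l2": "l3"}

print(is_heaplet(s1, w_minus, w_plus))   # s1 is a heaplet over w_plus
print(is_heaplet(s2, w_minus, w_plus))   # so is s2
print(is_heaplet(s2, w_minus, w_minus))  # but s2 is not a heap over w_minus
```

The last check shows why the second argument of $H$ matters: $s_2$ stores a reference to $\ell_3$, which exists in $w^+$ but not in $w^-$, whereas $s_1$ is already a heap over $w^-$.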


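The category $\mathbf{W}$ supports (local) independent coproducts described in Section 2.
These are constructed as follows. For $w, w' \in |\mathbf{W}|$,
$w \oplus w' = w \cup \{\ell + n + 1\colon S \mid (\ell\colon S) \in w'\}$ with $n$ being the
largest index for which $w$ is defined on $\#^{-1}(n)$. This yields a strict monoidal
structure $\oplus\colon \mathbf{W} \times \mathbf{W} \to \mathbf{W}$. Intuitively, $w_1 \oplus w_2$
is a canonical disjoint sum of $w_1$ and $w_2$, but note that $\oplus$ is not a coproduct
in $\mathbf{W}$ (e.g. there is no $\nabla\colon 1 \oplus 1 \to 1$, for $\mathbf{W}$ only contains
injections). For every $\rho\colon w_1 \to w_2$, there is a canonical complement
$\rho^{\complement}\colon w_2 \ominus \rho \to w_2$ whose domain
$w_2 \ominus \rho = w_2 \setminus \operatorname{img} \rho$ consists of all such cells
$(\ell\colon S) \in w_2$ that $\rho$ misses. Given two morphisms $\rho_1\colon w \to w_1$ and
$\rho_2\colon w \to w_2$, we define the local independent coproduct $w_1 \oplus_w w_2$ as the
layout consisting of the locations from $w$, and the ones from $w_1$ and $w_2$ which are
neither in the image of $\rho_1$ nor in the image of $\rho_2$:

\[ \rho_1 \oplus_w \rho_2 = w \oplus (w_1 \ominus \rho_1) \oplus (w_2 \ominus \rho_2). \]

There are morphisms $\rho_1 \bullet \rho_2\colon w_1 \to \rho_1 \oplus_w \rho_2$ and
$\rho_2 \bullet \rho_1\colon w_2 \to \rho_1 \oplus_w \rho_2$ such that

\[
\begin{array}{ccc}
w & \xrightarrow{\ \rho_2\ } & w_2 \\
{\scriptstyle \rho_1}\big\downarrow & & \big\downarrow{\scriptstyle \rho_2 \bullet \rho_1} \\
w_1 & \xrightarrow{\ \rho_1 \bullet \rho_2\ } & \rho_1 \oplus_w \rho_2
\end{array}
\]

Fig. 3 illustrates this definition with a concrete example.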
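
The construction of $\oplus$, of the complement and of the local independent coproduct can be traced on small inputs. The sketch below makes several assumptions that are not in the text: locations are natural numbers, layouts are dictionaries, the function names are hypothetical, and an empty left operand shifts by zero (the text leaves the largest index of an empty layout unspecified).

```python
def oplus(w, w2):
    """w (+) w2: keep w and shift every cell of w2 past the largest location of w."""
    n = max(w, default=-1)   # assumption: an empty w shifts by 0
    out = dict(w)
    out.update({l + n + 1: s for l, s in w2.items()})
    return out


def complement(rho, w2):
    """w2 (-) rho: the cells of w2 that the injection rho misses."""
    hit = set(rho.values())
    return {l: s for l, s in w2.items() if l not in hit}


def local_coproduct(rho1, w1, rho2, w2, w):
    """rho1 (+)_w rho2 = w (+) (w1 (-) rho1) (+) (w2 (-) rho2)."""
    return oplus(oplus(w, complement(rho1, w1)), complement(rho2, w2))


w = {0: "Int"}
w1 = {0: "Int", 1: "Int"}        # w plus one fresh Int cell
w2 = {0: "Ref Int", 1: "Int"}    # w embedded at location 1, plus a reference cell
rho1 = {0: 0}
rho2 = {0: 1}

result = local_coproduct(rho1, w1, rho2, w2, w)
print(result)
```

The result contains the shared cell of $w$ once, followed by shifted copies of the cell that $\rho_1$ misses in $w_1$ and of the cell that $\rho_2$ misses in $w_2$.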


Initialization and hiding Note that in the simple store model (Example 3), $H$
is equivalently a contravariant functor $H\colon \mathbf{W}^{op} \to \mathbf{Set}$ with
$Hw = \mathcal{V}^{w}$, hence $H$ can be placed e.g. in $[\mathbf{W}^{op}, \mathbf{Set}]$. In
general, $H$ is mix-variant, which calls for a more ingenious category where $H$ could be
placed. Designing such a category is indeed the key insight of [9]. Closely following
this work, we introduce a category $\mathbf{E}$, whose objects are the same as those of
$\mathbf{W}$, and the morphisms $\varepsilon \in \mathbf{E}(w, w')$, called initializations,
consist of an injection $\rho\colon w \to w'$ and a heaplet $\eta \in H(w' \ominus \rho, w')$:

\[ \mathbf{E}(w, w') = \sum_{\rho\colon w \to w'} H(w' \ominus \rho, w'). \]

Recall that the morphism $\rho\colon w \to w'$ represents a move from a world with $w$
allocated memory cells to a world with $w'$ allocated memory cells. A morphism of
$\mathbf{E}$ is a morphism of $\mathbf{W}$ augmented with a heaplet part $\eta$, which provides
the information how the newly allocated cells in $w' \ominus \rho$ are filled. The heap
functor now can be viewed as a representable presheaf $H\colon \mathbf{E} \to \mathbf{Set}$
essentially because by definition, $Hw = H(w, w) \cong \mathbf{E}(\emptyset, w)$. Let us agree to
use the notation $\varepsilon\colon w \rightsquigarrow w'$ for morphisms in $\mathbf{E}$ to avoid
confusion with the morphisms in $\mathbf{W}$.

Like $\mathbf{W}$, $\mathbf{E}$ supports local independent coproducts, but remarkably
$\mathbf{E}$ does not have vanilla independent coproducts, due to the fact that $\mathbf{E}$
does not have an initial object. That is, in turn, because defining an initial morphism
would amount to defining canonical fresh values for newly allocated cells, but those need
not exist. The local independent coproducts of $\mathbf{W}$ and $\mathbf{E}$ agree in the sense
that we can promote an initialization $(\rho_2, \eta)\colon w \rightsquigarrow w_2$ along an
injection $\rho_1\colon w \to w_1$ to obtain an initialization
$\rho_1 \bullet (\rho_2, \eta)\colon w_1 \rightsquigarrow \rho_1 \oplus_w \rho_2$. This is
accomplished by mapping the heaplet structure $\eta$ forward along
$\rho_2 \bullet \rho_1\colon w_2 \to \rho_1 \oplus_w \rho_2$.


Hiding monad Recall that the local store is supposed to be insensitive to
garbage collection. This is captured by identifying the stores that agree on their
observable parts using the hiding monad $P$ defined on $[\mathbf{E}, \mathbf{Set}]$ as follows:

\[ (PX)w = \int^{\rho\colon w \to w' \,\in\, w \downarrow u} X w'. \tag{2} \]

Here, $u\colon \mathbf{E} \to \mathbf{W}$ is the obvious heaplet discarding functor
$u(\rho, \eta) = \rho$. Intuitively, in (2), we view the locations of $w$ as public and the
ones of $w' \ominus \rho$ as private. The integral sign denotes a coend, which in this case
is just an ordinary colimit on $\mathbf{Set}$ and is computed as a quotient of
$\sum_{\rho\colon w \to w'} X w'$ under the equivalence relation $\sim$ obtained as a
symmetric-transitive closure of the relation

\[ (\rho\colon w \to w_1, x \in X w_1) \preceq
   (u\varepsilon \circ \rho\colon w \to w_2, (X\varepsilon)(x) \in X w_2)
   \qquad (\varepsilon\colon w_1 \rightsquigarrow w_2). \]


Note that $\preceq$ is a preorder. Moreover, it enjoys the following diamond property.

Proposition 5. If $(\rho, x) \preceq (\rho_1, x_1)$ and $(\rho, x) \preceq (\rho_2, x_2)$ then
$(\rho_1, x_1) \preceq (\rho', x')$ and $(\rho_2, x_2) \preceq (\rho', x')$ for a suitable
$(\rho', x')$.

Hence $(\rho_1, x_1) \sim (\rho_2, x_2)$ iff $(\rho_1, x_1) \preceq (\rho, x)$ and
$(\rho_2, x_2) \preceq (\rho, x)$ for some $(\rho, x)$.


Example 6. To illustrate the equivalence relation ~ behind P, we revisit the 
setting of Example 4. Consider the following situations: 


[Diagram: two pairs of heaps; the left pair is related by $\sim$, the right pair is not.]

Here, the solid lines indicate public locations and the dotted lines indicate
private locations. The left equivalence holds because the private locations are 
not reachable from the public ones by references (depicted as arrows). On the 
right, although the public parts are equal, the reachable cells of the private parts 
reveal the distinction, preventing the equivalence under ~. Intuitively, hiding 
identifies those heaps that agree both on their public and reachable private part. 
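
The intuition that hiding compares heaps on their public and reachable private parts can be sketched as follows. This is only an approximation of the quotient behind $P$ and rests on assumptions that are not in the text: heaps are dictionaries from location names to values, a value is a reference exactly when it is a string naming a cell of the heap, and private cells are compared by name, so the renaming of private locations that the equivalence $\sim$ also permits is ignored. The names `reachable` and `observable` are hypothetical.

```python
def reachable(public, heap):
    """Locations reachable from the public ones by following stored references."""
    seen, todo = set(), list(public)
    while todo:
        l = todo.pop()
        if l in seen:
            continue
        seen.add(l)
        v = heap.get(l)
        # assumption: a value is a reference iff it is the name of a cell
        if isinstance(v, str) and v in heap:
            todo.append(v)
    return seen


def observable(public, heap):
    """The public part of a heap together with the private cells reachable from it."""
    keep = reachable(public, heap)
    return {l: v for l, v in heap.items() if l in keep}


public = {"a"}

# Left situation: the private cell "p" is unreachable, so it is garbage.
print(observable(public, {"a": 5, "p": 7}) == observable(public, {"a": 5}))

# Right situation: "p" is reachable from "a", so its content is observable.
print(observable(public, {"a": "p", "p": 1}) == observable(public, {"a": "p", "p": 2}))
```

The first comparison succeeds and the second fails, matching the two situations described in Example 6.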


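The covariant action of $PX$ (on $\mathbf{E}$) is defined via promotion of initializations:

\[ (PX)(\varepsilon\colon w \rightsquigarrow w_2)(\rho\colon w \to w_1, x \in X w_1)_\sim
   = (u\varepsilon \bullet \rho\colon w_2 \to \rho \oplus_w u\varepsilon,\;
      X(\rho \bullet \varepsilon)(x))_\sim \]

Furthermore, there is a contravariant hiding operation (on $\mathbf{W}$) given by the
canonical action of the coend: for $\rho\colon w \to w'$, we define
$\mathrm{hide}_\rho\colon PXw' \to PXw$:

\[ \mathrm{hide}_\rho(\rho'\colon w' \to w'', x \in X w'')_\sim = (\rho' \circ \rho, x)_\sim \tag{3} \]

This allows us to regard $P$ both as a functor $[\mathbf{E}, \mathbf{Set}] \to [\mathbf{E}, \mathbf{Set}]$
and as a functor $[\mathbf{E}, \mathbf{Set}] \to [\mathbf{W}^{op}, \mathbf{Set}]$.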


Full ground store monad We now have all the necessary ingredients to
obtain the full ground store monad $T$ on $[\mathbf{W}, \mathbf{Set}]$. This monad is assembled
by composing the functors in Fig. 1 in the following way. First, observe that
$(P(- \times H))^H$ is a standard (global) store monad transform of $P$ on
$[\mathbf{E}, \mathbf{Set}]$. This monad is sandwiched between the adjunction
$u^\star \dashv u_\star$ induced by $u$ (see Section 2). Since any monad itself resolves into
an adjunction, sandwiching it between an adjunction again yields a monad. In summary,

\[ T = \Bigl([\mathbf{W}, \mathbf{Set}] \xrightarrow{\ u^\star\ } [\mathbf{E}, \mathbf{Set}]
       \xrightarrow{\ (P(- \times H))^H\ } [\mathbf{E}, \mathbf{Set}]
       \xrightarrow{\ u_\star\ } [\mathbf{W}, \mathbf{Set}]\Bigr). \tag{4} \]


Theorem 7. The monad $T$ defined by (4) is strong.


Proof. The proof is a straightforward generalization of the proof in [9]. 


We can recover the monad previously developed by Plotkin and Power [16] by 
resorting to the simple store (Example 3). 


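Theorem 8. Under the simple store model $T$ is isomorphic to the local store
monad from [16]:

\[ (TX)w \cong \Bigl(\int^{\rho\colon w \to w' \,\in\, w \downarrow \mathbf{W}}
   X w' \times \mathcal{V}^{w'}\Bigr)^{\mathcal{V}^{w}}. \]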


Using (4), one obtains the requisite semantics to the language in Fig. 2 using 
the standard clauses of fine-grain call-by-value [11], except for the special clauses 
for (put), (get) and (new), which require special operations of the monad: 


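\[
\begin{aligned}
\mathsf{get}&\colon u^\star\underline{\mathrm{Ref}_S} \times H \to u^\star\underline{\mathrm{CType}(S)} \times H \\
\mathsf{put}&\colon (u^\star\underline{\mathrm{Ref}_S} \times u^\star\underline{\mathrm{CType}(S)}) \times H \to 1 \times H \\
\mathsf{new}&\colon u^\star\bigl(\underline{\mathrm{CType}(S)}^{\underline{\mathrm{Ref}_S}}\bigr) \times H \to P(u^\star\underline{\mathrm{Ref}_S} \times H)
\end{aligned}
\]


5 Intermezzo: BI-Hyperdoctrines and BI-Algebras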


To be able to give a categorical notion of higher order logic over local store,
following Biering et al. [2], we aim to construct a BI-hyperdoctrine.

Note that algebraic structures, such as monoids and Heyting algebras can be 
straightforwardly internalized in any category with finite products, which gives 
rise to internal monoids, internal Heyting algebras, etc. The situation changes 
when considering non-algebraic properties. In particular, recall that a Heyting 
algebra A is complete iff it has arbitrary joins, which are preserved by binary 
meets. The corresponding categorical notion is essentially obtained from spelling 
out generic definitions from internal category theory [6, B2] and is as follows. 


Definition 9 (Internally Complete Heyting Algebras). An internal Heyting
(Boolean) algebra $A$ in a finitely complete category $\mathbf{C}$ is internally complete
if for every $f \in \mathbf{C}(I, J)$, there exist indexed joins
$\bigvee_f\colon \mathbf{C}(I, A) \to \mathbf{C}(J, A)$, left order-adjoint to
$(-) \circ f\colon \mathbf{C}(J, A) \to \mathbf{C}(I, A)$ such that for any pullback square
on the left, the corresponding diagram on the right commutes (Beck-Chevalley
condition):

[Diagram: a pullback square with corners $I'$, $J'$, $I$, $J$ (left) and the induced
square on $\mathbf{C}(I', A)$, $\mathbf{C}(J', A)$, $\mathbf{C}(I, A)$, $\mathbf{C}(J, A)$ (right).]

It follows generally that existence of indexed joins $\bigvee$ implies existence of indexed
meets $\bigwedge$, which then satisfy dual conditions ([6, Corollary 2.4.8]).


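Remark 10 (Binary Joins/Meets). The adjointness condition for indexed
joins means precisely that $\bigvee_f \varphi \le \psi$ iff $\varphi \le \psi \circ f$ for every
$\varphi\colon I \to A$ and every $\psi\colon J \to A$. If $\mathbf{C}$ has binary coproducts, by
taking $f = \nabla\colon X + X \to X$ we obtain that $\bigvee_\nabla \varphi \le \psi$ iff
$\varphi \le [\psi, \psi]$ iff $\varphi \circ \mathrm{inl} \le \psi$ and
$\varphi \circ \mathrm{inr} \le \psi$. This characterizes
$\bigvee_\nabla [\varphi_1, \varphi_2]\colon X \to A$ as the binary join of
$\varphi_1, \varphi_2\colon X \to A$. Binary meets are characterized analogously.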


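Definition 11 ((First Order) (BI-)Hyperdoctrine). Let $\mathbf{C}$ be a category
with finite products. A first order hyperdoctrine over $\mathbf{C}$ is a functor
$S\colon \mathbf{C}^{op} \to \mathbf{Poset}$ with the following properties:

1. given $X \in |\mathbf{C}|$, $SX$ is a Heyting algebra;

2. given $f \in \mathbf{C}(X, Y)$, $Sf\colon SY \to SX$ is a Heyting algebra morphism;

3. for any product projection $\mathrm{fst}\colon X \times Y \to X$, there are
   $(\exists Y)_X\colon S(X \times Y) \to SX$ and $(\forall Y)_X\colon S(X \times Y) \to SX$,
   which are respective left and right order-adjoints of
   $S\,\mathrm{fst}\colon SX \to S(X \times Y)$, naturally in $X$;

4. for every $X \in |\mathbf{C}|$, there is ${=_X} \in S(X \times X)$ such that for all
   $\varphi \in S(X \times X)$, $\top \le (S\langle \mathrm{id}_X, \mathrm{id}_X\rangle)(\varphi)$
   iff ${=_X} \le \varphi$.

If additionally

5. given $X \in |\mathbf{C}|$, $SX$ is a BI-algebra, i.e. a commutative monoid equipped with
   a right order-adjoint to multiplication;

6. given $f \in \mathbf{C}(X, Y)$, $Sf\colon SY \to SX$ is a BI-algebra morphism,

then $S$ is called a first order BI-hyperdoctrine.

In a (higher order) hyperdoctrine, $\mathbf{C}$ is additionally required to be Cartesian
closed and every $SX$ is required to be poset-isomorphic to $\mathbf{C}(X, A)$ for a suitable
internal Heyting algebra $A \in |\mathbf{C}|$ naturally in $X$. Such a hyperdoctrine is a
BI-hyperdoctrine if moreover $A$ is an internal BI-algebra.

\[
\frac{\Gamma \vdash_{\mathsf v} v\colon A \quad \Gamma \vdash \varphi\colon PA}
     {\Gamma \vdash \varphi(v)\colon \mathsf{prop}}
\qquad
\frac{\Gamma, x\colon A \vdash \varphi\colon \mathsf{prop}}
     {\Gamma \vdash x.\,\varphi\colon PA}
\qquad
\frac{\Gamma \vdash_{\mathsf v} \ell\colon \mathrm{Ref}_S \quad \Gamma \vdash_{\mathsf v} v\colon \mathrm{CType}(S)}
     {\Gamma \vdash \ell \hookrightarrow v\colon \mathsf{prop}}
\]
\[
\frac{\Gamma \vdash \varphi\colon PA}
     {\Gamma \vdash Q\varphi\colon \mathsf{prop}}\ (Q \in \{\exists, \forall\})
\qquad
\frac{\Gamma \vdash_{\mathsf v} v\colon A \quad \Gamma \vdash_{\mathsf v} w\colon A}
     {\Gamma \vdash v = w\colon \mathsf{prop}}
\]
\[
\frac{}{\Gamma \vdash c\colon \mathsf{prop}}\ (c \in \{\top, \bot\})
\qquad
\frac{\Gamma \vdash \varphi\colon \mathsf{prop} \quad \Gamma \vdash \psi\colon \mathsf{prop}}
     {\Gamma \vdash \varphi \mathbin{\$} \psi\colon \mathsf{prop}}\
     (\$ \in \{\wedge, \vee, \Rightarrow, \star, -\!\!\star\})
\]

Fig. 4: Term formation rules for the higher order separation logic.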


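Proposition 12. Every internally complete Heyting algebra $A$ in a Cartesian
closed category $\mathbf{C}$ with finite limits gives rise to a canonical hyperdoctrine
$\mathbf{C}(-, A)$: for every $X$, $\mathbf{C}(X, A)$ is a poset under $f \le g$ iff
$f \wedge g = f$.


Proof. Clearly, every $\mathbf{C}(X, A)$ is a Heyting algebra and every $\mathbf{C}(f, A)$ is a
Heyting algebra morphism. The quantifiers are defined mutually dually as follows:

\[ (\exists Y)_X(\varphi\colon X \times Y \to A) = \bigvee_{\mathrm{fst}\colon X \times Y \to X} \varphi,
   \qquad
   (\forall Y)_X(\varphi\colon X \times Y \to A) = \bigwedge_{\mathrm{fst}\colon X \times Y \to X} \varphi. \]

Naturality in $X$ follows from the corresponding Beck-Chevalley conditions.
Finally, internal equality ${=_X}\colon X \times X \to A$ is defined as
$\bigvee_{\langle \mathrm{id}_X, \mathrm{id}_X\rangle} \top$.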


A standard way to obtain an (internally) complete BI-algebra is to resort to
ordered partial commutative monoids [18].


Definition 13 (Ordered PCM [18]). An ordered partial commutative monoid
(pcm) is a tuple $(M, E, \cdot, \le)$ where $M$ is a set, $E \subseteq M$ is a set of units,
multiplication $\cdot$ is a partial binary operation on $M$, and $\le$ is a preorder on $M$,
satisfying a number of axioms (see [18] for details).


We note that using general recipes [3], for every internal ordered pcm $M$ in a
topos $\mathbf{C}$ with subobject classifier $\Omega$, $\mathbf{C}(- \times M, \Omega)$ forms a
BI-hyperdoctrine; in particular, if $\mathbf{C} = \mathbf{Set}$ then
$\mathbf{Set}(- \times M, 2)$ is a BI-hyperdoctrine.

6 A Higher Order Logic for Full Ground Store


We proceed to develop a local version of separation logic using semantic principles
explored in the previous sections. That is, we seek an interpretation for the
language in Fig. 4 in the category $[\mathbf{W}, \mathbf{Set}]$ over the type system (1),
extended with predicate types $PA$. The judgements $\Gamma \vdash \varphi\colon \mathsf{prop}$
type formulas depending on a variable context $\Gamma$. Additionally, we have judgements of
the form $\Gamma \vdash \varphi\colon PA$ for predicates in context. Both kinds of judgements
are mutually convertible using the standard application-abstraction routine. Note that
expressions for quantifiers $\exists x.\,\varphi$ are thus obtained in two steps: by forming a
predicate $x.\,\varphi$, and subsequently applying $\exists$. Apart from the standard logical
connectives, we postulate separating conjunction $\star$ and separating implication
$-\!\!\star$.

Our goal is to build a BI-hyperdoctrine, using the recipes summarized in
the previous section. That is, we construct a certain internal BI-algebra $\Theta$ in
$[\mathbf{W}, \mathbf{Set}]$, and subsequently conclude that $[-, \Theta]$ is the
BI-hyperdoctrine in question. In what follows, most of the effort is invested into
constructing an internally complete Boolean algebra $\mathcal{P} \circ (\hat P\hat H)$ (hence
$[-, \mathcal{P} \circ (\hat P\hat H)]$ is a hyperdoctrine), from which $\Theta$ is carved out
as a subfunctor, identified by an upward closure condition. Here, $\mathcal{P}$ is a
contravariant powerset functor, and $\hat P$ and $\hat H$ are certain modifications of the
hiding and the heap functors from Section 4. As we shall see, the move from
$\mathcal{P} \circ (\hat P\hat H)$ to $\Theta$ remedies the problem of the former that the
natural separating conjunction operator $\star$ on it does not have a unit (Remark 19).

In order to model resource separation, we must identify a domain of logical
assertions over partial heaps, i.e. heaplets, instead of total heaps. We thus need
to derive a unary (covariant) heaplet functor from the binary, mix-variant one $H$
used before. We must still cope not only with heaplets, but with partially hidden
heaplets, to model information hiding. A seemingly natural candidate functor for
hidden heaplets is the composition

\[ P\Bigl(\mathbf{E} \xrightarrow{\ \sum_{w' \subseteq (-)} H(w', -)\ } \mathbf{Set}\Bigr)\colon
   \mathbf{W}^{op} \to \mathbf{Set}. \]

One problem of this definition is that the equivalence relation $\sim$ underlying the
construction of $P$ (2) is too fine. Consider, for example,
$\epsilon_w = (\emptyset \subseteq w, \star) \in \sum_{w' \subseteq w} H(w', w)$. Then
$(\mathrm{id}\colon w \to w, \epsilon_w) \not\sim
 (\mathrm{inl}\colon w \to w \oplus \{\star\colon 1\}, \epsilon_{w \oplus \{\star\colon 1\}})$,
i.e. two hidden heaplets would not be equivalent if one extends the other by
an inaccessible hidden cell. In order to arrive at a more reasonable model of
logical assertions, we modify the previous model by replacing the category of
initializations $\mathbf{E}$ by a category $\hat{\mathbf{E}}$ of partial initializations. This
will induce a hiding monad $\hat P$ over $[\hat{\mathbf{E}}, \mathbf{Set}]$ using exactly the
same formula (2) as for $P$.

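A partial initialization is a pair $(\rho, \eta)$ with $\rho \in \mathbf{W}(w_1, w_2)$ and
$\eta \in \sum_{w \subseteq w_2 \ominus \rho} H(w, w_2)$. Let $\hat{\mathbf{E}}$ be the category
of heap layouts and partial initializations. Analogously to $u$, there is an obvious
partial-heap-forgetting functor $\hat u\colon \hat{\mathbf{E}} \to \mathbf{W}$. Let
$\hat H\colon \hat{\mathbf{E}} \to \mathbf{Set}$ be the following heaplet functor:

\[ \hat H w = \sum_{w' \subseteq w} H(w', w). \]

Given a partial initialization
$\varepsilon = (\rho\colon w \to w', (w'' \subseteq w' \ominus \rho, \eta \in H(w'', w')))\colon
 w \rightsquigarrow w'$,
$\hat H\varepsilon\colon \hat H w \to \hat H w'$ extends a given heaplet over $w$ to a heaplet
over $w'$ via $\eta$:

\[ (\hat H\varepsilon)(w_1 \subseteq w, \eta' \in H(w_1, w)) = (\rho[w_1] \cup w'' \subseteq w', \eta'') \]

where $\eta'' \in H(\rho[w_1] \cup w'', w')$ is as follows

\[
\begin{aligned}
\mathrm{pr}_{\rho(\ell\colon S)}\,\eta'' &= \mathrm{range}(S)(\rho)(\mathrm{pr}_{(\ell\colon S)}\,\eta')
  && ((\ell\colon S) \in w_1) \\
\mathrm{pr}_{(\ell\colon S)}\,\eta'' &= \mathrm{pr}_{(\ell\colon S)}\,\eta
  && ((\ell\colon S) \in w'')
\end{aligned}
\]

With $\hat{\mathbf{E}}$ and $\hat H$ as above instead of $\mathbf{E}$ and $H$, the framework
described in Section 4 transforms coherently.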


Remark 14. Let us fix a fresh symbol $\boxtimes$, and note that

\[ \hat H w = \sum_{w' \subseteq w} \prod_{(\ell\colon S) \in w'} \mathrm{range}(S)(w)
   \cong \prod_{(\ell\colon S) \in w} \bigl(\mathrm{range}(S)(w) \uplus \{\boxtimes\}\bigr), \]

meaning that the passage from $\mathbf{E}$, $H$ and $P$ to $\hat{\mathbf{E}}$, $\hat H$ and
$\hat P$ is equivalent to extending the range function with designated values $\boxtimes$ for
inaccessible locations. We prefer to think of $\boxtimes$ this way and not as a content of
dangling pointers, to emphasize that we deal with a reasoning phenomenon and not with a
programming phenomenon, for our programs neither create nor process dangling pointers.
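
The isomorphism of Remark 14 amounts to padding a partial heaplet with a designated value on the cells it leaves undefined. A minimal sketch, assuming heaplets are dictionaries and using the hypothetical names `INACCESSIBLE`, `to_total` and `to_partial`:

```python
# A unique sentinel object standing in for the fresh symbol of Remark 14.
INACCESSIBLE = object()


def to_total(partial, w):
    """Pad a heaplet defined on a sublayout of w to all cells of w."""
    return {l: partial.get(l, INACCESSIBLE) for l in w}


def to_partial(total):
    """Forget the padding again, recovering the sublayout and its values."""
    return {l: v for l, v in total.items() if v is not INACCESSIBLE}


w = {"l1": "Int", "l2": "Int"}
heaplet = {"l1": 5}                 # defined on the sublayout {l1: Int}

padded = to_total(heaplet, w)
print(padded["l2"] is INACCESSIBLE)
print(to_partial(padded) == heaplet)
```

The two functions are mutually inverse on heaplets over `w`, which is the content of the displayed isomorphism in this encoding.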


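For the next proposition we need the following concrete description of the set
$\hat u_\star(2^X)w$ as the end
$\int_{\rho\colon w \to w' \,\in\, w \downarrow \hat u} \mathbf{Set}(X w', 2)$: this set is a
space of dependent functions $\varphi$ sending every injection $\rho\colon w \to w'$ to a
corresponding subset of $X w'$, and satisfying the constraint: $x \in \varphi(\rho)$ iff
$(X\varepsilon)(x) \in \varphi(\hat u\varepsilon \circ \rho)$ for every
$\varepsilon\colon w' \rightsquigarrow w''$.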


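Proposition 15. The following diagram commutes up to isomorphism:

[Diagram: a square of functors between $[\hat{\mathbf{E}}, \mathbf{Set}]^{op}$,
$[\hat{\mathbf{E}}, \mathbf{Set}]$, $[\mathbf{W}, \mathbf{Set}]^{op}$ and
$[\mathbf{W}, \mathbf{Set}]$, comparing $\hat u_\star(2^{(-)})$ with
$\mathcal{P} \circ \hat P(-)$.]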


(using the fact that [W, Set]? = [W°?, Set]) where P is the contravariant 
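powerset functor $\mathcal{P}\colon \mathbf{Set}^{op} \to \mathbf{Set}$ and for every
$X\colon \hat{\mathbf{E}} \to \mathbf{Set}$ the relevant isomorphism
$\Phi_w\colon \hat u_\star(2^X)w \cong \mathcal{P}(\hat P X w)$ is as follows:

\[ (\rho\colon w \to w', x \in X w')_\sim \in \Phi_w(\varphi \in \hat u_\star(2^X)w)
   \iff x \in \varphi(\rho). \tag{5} \]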


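Let us clarify the significance of Proposition 15. The exponential $2^{\hat H}$ in
$[\hat{\mathbf{E}}, \mathbf{Set}]$ can be thought of as a carrier of Boolean predicates over
$\hat H$, and as we see next those form an internally complete Boolean algebra, which is
carried from $[\hat{\mathbf{E}}, \mathbf{Set}]$ to $[\mathbf{W}, \mathbf{Set}]$ by
$\hat u_\star$. The alternative route via $\hat P$ and $\mathcal{P}$ induces a Boolean algebra
of predicates over hidden heaplets $\hat P\hat H$ directly in $[\mathbf{W}, \mathbf{Set}]$.
The equivalence established in Proposition 15 witnesses agreement of these two structures.

Theorem 16. For every $X\colon \hat{\mathbf{E}} \to \mathbf{Set}$,
$\mathcal{P} \circ (\hat P X)$ is an internally complete Boolean algebra in
$[\mathbf{W}, \mathbf{Set}]$ under

\[
\begin{aligned}
\Bigl(\bigvee\nolimits_f \varphi\colon I \to \mathcal{P} \circ (\hat P X)\Bigr)_w (j \in Jw)
 = \{ (\rho\colon w \to w', x \in X w')_\sim \mid{}
   & \exists \varepsilon\colon w' \rightsquigarrow w''.\ \exists i \in I w''. \\
   & f_{w''}(i) = J(\hat u\varepsilon \circ \rho)(j) \wedge
     (\mathrm{id}_{w''}, (X\varepsilon)(x))_\sim \in \varphi_{w''}(i) \}, \\
\Bigl(\bigwedge\nolimits_f \varphi\colon I \to \mathcal{P} \circ (\hat P X)\Bigr)_w (j \in Jw)
 = \{ (\rho\colon w \to w', x \in X w')_\sim \mid{}
   & \forall \varepsilon\colon w' \rightsquigarrow w''.\ \forall i \in I w''. \\
   & f_{w''}(i) = J(\hat u\varepsilon \circ \rho)(j) \Rightarrow
     (\mathrm{id}_{w''}, (X\varepsilon)(x))_\sim \in \varphi_{w''}(i) \}
\end{aligned}
\]

for every $f\colon I \to J$, and the corresponding Boolean algebra operations are
computed as set-theoretic unions, intersections and complements.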


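By Theorem 16, we obtain a hyperdoctrine $[-, \mathcal{P} \circ (\hat P\hat H)]$, which
provides us with a model of (classical) higher order logic in $[\mathbf{W}, \mathbf{Set}]$.
In particular, this allows us to interpret the language from Fig. 4 over
$[\mathbf{W}, \mathbf{Set}]$ excluding the separation logic constructs, in such a way that

\[ [\![\Gamma \vdash \varphi\colon \mathsf{prop}]\!]\colon \underline{\Gamma} \to \mathcal{P} \circ (\hat P\hat H),
   \qquad
   [\![\Gamma \vdash \varphi\colon PA]\!]\colon \underline{\Gamma} \times \underline{A} \to \mathcal{P} \circ (\hat P\hat H) \]

where $\underline{\Gamma} = \underline{A_1} \times \ldots \times \underline{A_n}$ for
$\Gamma = (x_1\colon A_1, \ldots, x_n\colon A_n)$ where, additionally to the standard clauses,
$\underline{PA} = \mathcal{P} \circ \hat P(\hat u^\star\underline{A} \times \hat H)$. The latter
interpretation of predicate types $PA$ is justified by the natural isomorphism:

\[ (\mathcal{P} \circ (\hat P\hat H))^X \cong (\hat u_\star(2^{\hat H}))^X
   \cong \hat u_\star\bigl((2^{\hat H})^{\hat u^\star X}\bigr)
   \cong \mathcal{P} \circ (\hat P(\hat u^\star X \times \hat H)). \]

Here, the first and the last transitions are by $\Phi$ from Proposition 15 and the
middle one is due to the fact that clearly both $(\hat u_\star(-))^X$ and
$\hat u_\star((-)^{\hat u^\star X})$ are right adjoint to $\hat u^\star(X \times (-))$.

Since every set $\hat H w$ models a heaplet in the standard sense [18], we can
equip $\hat H w$ with a standard pointer model structure.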


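Proposition 17. For every $w \in |\mathbf{W}|$,
$(\hat H w, \{(\emptyset \subseteq w, \star)\}, \cdot, \le)$ is an ordered pcm
where for every $w \in |\mathbf{W}|$, $\hat H w$ is partially ordered as follows:

\[ (w_1 \subseteq w, H(w_1 \subseteq w_2, w)\eta \in H(w_1, w)) \le (w_2 \subseteq w, \eta \in H(w_2, w))
   \qquad (w_1 \subseteq w_2) \]

and for $w_1 \subseteq w$, $w_2 \subseteq w$ and $\eta_1 \in H(w_1, w)$, $\eta_2 \in H(w_2, w)$,
$(w_1 \subseteq w, \eta_1) \cdot (w_2 \subseteq w, \eta_2)$ equals
$(w_1 \cup w_2, \eta_1 \cup \eta_2)$ if $w_1 \cap w_2 = \emptyset$, and is otherwise undefined.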
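
The ordered pcm structure of Proposition 17 has a direct reading on finite maps. The sketch below is an illustration under assumptions not made in the text: a heaplet $(w' \subseteq w, \eta)$ is represented by the dictionary of its defined cells only, the ambient layout $w$ is left implicit, and `mul`, `leq` and `UNIT` are hypothetical names.

```python
UNIT = {}   # the empty heaplet, the only unit of the pcm


def mul(h1, h2):
    """Partial multiplication: union of heaplets with disjoint domains."""
    if h1.keys() & h2.keys():
        return None          # undefined when the domains overlap
    return {**h1, **h2}


def leq(h1, h2):
    """h1 <= h2 iff h1 is the restriction of h2 to a sublayout."""
    return all(l in h2 and h2[l] == h1[l] for l in h1)


a = {"l1": 5}
b = {"l2": "l1"}

print(mul(a, b))                   # the combined heaplet
print(mul(a, a))                   # None: overlapping domains
print(mul(a, b) == mul(b, a))      # multiplication is commutative
print(leq(a, mul(a, b)))           # each factor lies below the product
```

Returning `None` for an undefined product keeps the partiality explicit; a caller has to test for it before combining further.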


As indicated in Section 5, we automatically obtain a BI-algebra structure over
the set of all subsets of $\hat H w$. The same strategy does not apply to
$\hat P\hat H w$, roughly because we cannot predict the mutual arrangement of hidden
partitions of two heaplets w.r.t. each other, for we do not have a global reference
space for pointers as contrasted to the standard separation logic setting. We thus
define a separating conjunction operator directly on every
$\mathcal{P}(\hat P\hat H w)$ as follows:

\[
\begin{aligned}
\varphi \star_w \psi = \{ (\rho\colon w \to w', (w_1 \uplus w_2 \subseteq w', \eta \in H(w_1 \uplus w_2, w')))_\sim \mid{}
 & (\rho, (w_1 \subseteq w', H(w_1 \subseteq w_1 \uplus w_2, w')\eta))_\sim \in \varphi, \\
 & (\rho, (w_2 \subseteq w', H(w_2 \subseteq w_1 \uplus w_2, w')\eta))_\sim \in \psi \}.
\end{aligned}
\]


Lemma 18. The operator $\star_w$ on $\mathcal{P}(\hat P\hat H w)$ satisfies the following
properties.

1. $\star_w$ is natural in $w$.

2. $\star_w$ is associative and commutative.

3. $(\rho\colon w \to w', (w'' \subseteq w', \eta \in H(w'', w')))_\sim \in \varphi \star_w \psi$
   if and only if there exist $w_1, w_2$ such that $w_1 \uplus w_2 = w''$,
   $(\rho, (w_1 \subseteq w', H(w_1 \subseteq w'', w')\eta))_\sim \in \varphi$ and
   $(\rho, (w_2 \subseteq w', H(w_2 \subseteq w'', w')\eta))_\sim \in \psi$.


Property (3) specifically tells us that any representative of an equivalence class
contained in a separating conjunction can be split in such a way that the respective
pieces belong to the arguments of the separating conjunction.


Remark 19. The only candidate for the unit of the separating conjunction $\star_w$
would be the emptiness predicate
$\mathrm{empty}_w\colon 1 \to \mathcal{P}(\hat P\hat H w)$, identifying precisely
the empty heaplets. However, $\mathrm{empty}_w$ is not natural in $w$. In fact, it follows by
the Yoneda lemma that there are exactly two natural transformations
$1 \to \mathcal{P} \circ (\hat P\hat H)$, which are the total truth and the total false,
neither of which is a unit for $\star_w$.


Remark 19 provides a formal argument why we cannot interpret classical separation
logic over $\mathcal{P} \circ (\hat P\hat H)$. We thus proceed to identify for every $w$ a
subset of $\mathcal{P}(\hat P\hat H w)$, for which the total truth predicate becomes the unit
of the separating conjunction. Concretely, let $\Theta$ be the subfunctor of
$\mathcal{P} \circ (\hat P\hat H)$ identified by the following upward closure condition:
$\varphi \in \Theta w$ if

\[ (\rho, \eta)_\sim \in \varphi,\ \eta \le \eta' \quad\text{imply}\quad (\rho, \eta')_\sim \in \varphi. \]
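
The effect of the upward closure condition on the unit of the separating conjunction can be observed in a finite toy model. The sketch below deliberately ignores hiding and naturality, which are essential in the text: it works with plain heaplets over two locations and two values, encoded as frozensets of (location, value) pairs, so that the heaplet order is set inclusion. All names (`UNIVERSE`, `star`, `cl`, `EMP`) are hypothetical.

```python
from itertools import product

LOCS = ("a", "b")
VALS = (0, 1)


def all_heaplets():
    """Every partial assignment of values to locations, as a frozenset of pairs."""
    out = set()
    for choice in product((None,) + VALS, repeat=len(LOCS)):
        out.add(frozenset((l, v) for l, v in zip(LOCS, choice) if v is not None))
    return out


UNIVERSE = all_heaplets()        # also plays the role of the total truth predicate
EMP = {frozenset()}              # the emptiness predicate


def dom(h):
    return {l for l, _ in h}


def star(p, q):
    """Separating conjunction: all unions of disjoint heaplets from p and q."""
    return {h1 | h2 for h1 in p for h2 in q if not (dom(h1) & dom(h2))}


def cl(p):
    """Upward closure of a predicate with respect to heaplet extension."""
    return {h for h in UNIVERSE if any(g <= h for g in p)}


P0 = {frozenset({("a", 1)})}     # "exactly the cell a, holding 1": not upward closed
P = cl(P0)                       # "a holds 1, possibly among other cells"

print(star(P0, EMP) == P0)       # emp is a unit for arbitrary predicates
print(star(P0, UNIVERSE) == P0)  # truth is not a unit in general
print(star(P, UNIVERSE) == P)    # but it is a unit on upward closed predicates
```

In this toy model `star(p, UNIVERSE)` always equals `cl(p)`, which explains both outcomes: truth acts as a unit exactly on the upward closed predicates.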


Lemma 20. $\Theta$ is an internal complete sublattice of
$\mathcal{P} \circ (\hat P\hat H)$, i.e. the inclusion
$\iota\colon \Theta \to \mathcal{P} \circ (\hat P\hat H)$ preserves all meets and all joins.
This canonically equips $\Theta$ with an internally complete Heyting algebra structure.


Proof (Sketch). The key idea is to establish a retraction $(\iota, \mathrm{cl})$ with
$\mathrm{cl} \circ \iota = \mathrm{id}$. The requisite structure is then transferred from
$\mathcal{P} \circ (\hat P\hat H)$ to $\Theta$ along it. The Heyting implication for $\Theta$ is
obtained using the standard formula
$(\varphi \Rightarrow \psi) = \bigvee\{\xi \mid \varphi \wedge \xi \le \psi\}$ interpreted in
the internal language.


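Lemma 21. Separating conjunction preserves upward closure: for
$\varphi, \psi \in \Theta w$, $\varphi \star_w \psi = \mathrm{cl}_w(\varphi \star_w \psi)$.

Lemma 22. $\Theta$ is a BI-algebra: $\star$ is obtained by restriction from
$\mathcal{P}(\hat P\hat H w)$ by Lemma 21, $\hat P\hat H w$ is the unit for it and

\[
\begin{aligned}
\varphi \mathbin{-\!\!\star_w} \psi = \{ (\rho, \eta)_\sim \in \hat P\hat H w \mid{}
 & \forall \rho'\colon w \to w',\ \eta_1, \eta_2 \in \hat H w'.\ \eta_1 \cdot \eta_2 \text{ defined} \wedge{} \\
 & (\rho, \eta) \sim (\rho', \eta_1) \wedge (\rho', \eta_2)_\sim \in \varphi
   \Rightarrow (\rho', \eta_1 \cdot \eta_2)_\sim \in \psi \}.
\end{aligned}
\]

- $s, \rho, \eta \models \top$
- $s, \rho, \eta \models \varphi \wedge \psi$ if $s, \rho, \eta \models \varphi$ and $s, \rho, \eta \models \psi$
- $s, \rho, \eta \models \varphi \vee \psi$ if $s, \rho, \eta \models \varphi$ or $s, \rho, \eta \models \psi$
- $s, \rho, \eta \models \varphi \Rightarrow \psi$ if for all $(\rho, \eta) \sim (\rho', \eta')$ and $\eta' \le \eta''$,
  $s, \rho', \eta'' \models \varphi$ implies $s, \rho', \eta'' \models \psi$
- $s, \rho, \eta \models \varphi(v)$ if $s, \rho, (([\![\Gamma \vdash_{\mathsf v} v\colon A]\!]_{w'} \circ \underline{\Gamma}\rho)s, \eta) \models \varphi$
- $s, \rho, (a, \eta) \models x.\,\varphi$ if $a = (\underline{A}\rho)b$ and $(s, b), \rho, \eta \models \varphi$
- $s, \rho, \eta \models \ell \hookrightarrow v$ if $\eta = (w'' \subseteq w', \delta \in H(w'', w'))$ and
  $\delta(r\colon S) = ([\![\Gamma \vdash_{\mathsf v} v\colon \mathrm{CType}(S)]\!]_{w'} \circ \underline{\Gamma}\rho)s$
  where $([\![\Gamma \vdash_{\mathsf v} \ell\colon \mathrm{Ref}_S]\!]_{w'} \circ \underline{\Gamma}\rho)s = (r\colon S) \in w''$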
ania =a UN bo Apelor S(T a aTr 


for some p’: w > w” 


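- $s, \rho, \eta \models \varphi \star \psi$ if for suitable $w_1, w_2$, $\eta \in H(w_1 \uplus w_2, w')$,
  $s, \rho, (w_1 \subseteq w', H(w_1 \subseteq w_1 \uplus w_2, w')\eta) \models \varphi$ and
  $s, \rho, (w_2 \subseteq w', H(w_2 \subseteq w_1 \uplus w_2, w')\eta) \models \psi$
- $s, \rho, \eta \models \varphi \mathbin{-\!\!\star} \psi$ if for all $(\rho', \eta_1) \sim (\rho, \eta)$ and for all $\eta_2$
  such that $\eta_1 \cdot \eta_2$ is defined,
  $s, \rho', \eta_2 \models \varphi$ implies $s, \rho', \eta_1 \cdot \eta_2 \models \psi$
- $s, \rho, \eta \models \exists\varphi$ if
  $\underline{\Gamma}(\hat u\varepsilon \circ \rho)s, \mathrm{id}_{w''}, (a, (\hat H\varepsilon)\eta) \models \varphi$
  for some $\varepsilon\colon w' \rightsquigarrow w''$, $a \in \underline{A}w''$
- $s, \rho, \eta \models \forall\varphi$ if
  $\underline{\Gamma}(\hat u\varepsilon \circ \rho)s, \mathrm{id}_{w''}, (a, (\hat H\varepsilon)\eta) \models \varphi$
  for all $\varepsilon\colon w' \rightsquigarrow w''$, $a \in \underline{A}w''$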


Fig. 5: Semantics of the logic. 


Proof. In view of Lemma 20, we are left to show that the given operations are
natural and that $\Theta$ is an internal BI-algebra w.r.t. them. Since BI-algebras form
a variety [5], it suffices to show that each $\Theta w$ is a BI-algebra. By Lemma 18 (ii),
it suffices to show that every $(-) \star_w \varphi$ preserves arbitrary joins, for then we
can use the standard formula to calculate $\varphi \mathbin{-\!\!\star_w} \psi$, which happens
to be natural in $w$:

\[ \varphi \mathbin{-\!\!\star_w} \psi = \bigcup \{\xi \mid \varphi \star_w \xi \le \psi\}. \]

By unfolding the right-hand side, we obtain the expression for $-\!\!\star_w$ figuring in
the statement of the lemma.


Theorem 23. $\Theta$ is an internally complete Heyting BI-algebra, hence $[-, \Theta]$ is a
BI-hyperdoctrine.


Proof. Follows from Lemmas 20 and 22.


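This now provides us with a complete semantics of the language in Fig. 4 with
$[\![\Gamma \vdash \varphi\colon \mathsf{prop}]\!]\colon \underline{\Gamma} \to \Theta$ and
$[\![\Gamma \vdash \varphi\colon PA]\!]\colon \underline{\Gamma} \to \underline{PA}$ where
$\underline{PA}$ is the upward closed subfunctor of
$\mathcal{P} \circ (\hat P(\hat u^\star\underline{A} \times \hat H))$, with upward closure only
on the $\hat H$-part, which is isomorphic to $\Theta^{\underline{A}}$. The resulting semantics
is defined in Fig. 5 where we write $s, \rho, \eta \models \varphi$ for
$(\rho, \eta)_\sim \in [\![\Gamma \vdash \varphi\colon \mathsf{prop}]\!](s)$ and
$s, \rho, (a, \eta) \models \varphi$ for
$(\rho, (a, \eta))_\sim \in [\![\Gamma \vdash \varphi\colon PA]\!](s)$. The following
properties [4] are then automatic.


Proposition 24.
- (Monotonicity) If $s, \rho, \eta \models \varphi$ and $\eta \le \eta'$ then
  $s, \rho, \eta' \models \varphi$.
- (Shrinkage) If $s, \rho, \eta \models \varphi$, $\eta' \le \eta$ and $\eta'$ contains all cells
  reachable from $s$ and $w$ then $s, \rho, \eta' \models \varphi$.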


7 Examples 


Let us illustrate subtle features of our semantics by some examples. 


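Example 25. Consider the formula
$\exists \ell\colon \mathrm{Ref}_{\mathrm{Int}}.\ \ell \hookrightarrow 5$ from the introduction in
the empty context $-$. Then $-, \rho, \eta \models \exists \ell.\ \ell \hookrightarrow 5$ iff for
some $\varepsilon\colon w' \rightsquigarrow w''$, and some
$\ell \in \underline{\mathrm{Ref}_{\mathrm{Int}}}w''$,
$\ell, \mathrm{id}_{w''}, (\hat H\varepsilon)\eta \models \ell \hookrightarrow 5$. The latter is
true iff $\mathrm{pr}_\ell((\hat H\varepsilon)\eta) = 5$. Note that $w'$ may not contain $\ell$
and it is always possible to choose $\varepsilon$ so that $w''$ contains $\ell$ and
$\mathrm{pr}_\ell((\hat H\varepsilon)\eta) = 5$. Hence, the original formula is always valid.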


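Example 26. The clauses in Fig. 5 are very similar to the standard Kripke
semantics of intuitionistic logic. Note however, that the clause for implication
strikingly differs from the expected one:

- $s, \rho, \eta \models \varphi \Rightarrow \psi$ if for all $\eta \le \eta'$,
  $s, \rho, \eta' \models \varphi$ implies $s, \rho, \eta' \models \psi$.

The latter is indeed not validated by our semantics, as witnessed by the
following example. Consider the following formulas $\varphi$ and $\psi$ respectively:

\[ \ell\colon \mathrm{Ref}_{\mathrm{Ref}_{\mathrm{Int}}} \vdash
   \exists \ell'.\ \exists x.\ \ell \hookrightarrow \ell' \wedge \ell' \hookrightarrow x\colon \mathsf{prop} \tag{6} \]
\[ \ell\colon \mathrm{Ref}_{\mathrm{Ref}_{\mathrm{Int}}} \vdash
   \exists \ell'.\ \ell \hookrightarrow \ell' \wedge \ell' \hookrightarrow 6\colon \mathsf{prop} \tag{7} \]

The first formula is valid over heaplets in which $\ell$ refers to a reference to some
integer, while the second one is only valid over heaplets in which $\ell$ refers to a
reference to 6. Any
$\eta' \ge \eta = (\mathrm{id}_w, (\{\ell''\} \subseteq \{\ell, \ell''\}, [\ell'' \mapsto 6]))$
satisfies both (6) and (7) or neither of them. However, the implication
$\varphi \Rightarrow \psi$ still is not valid over $\eta$ in our semantics, for

\[
\begin{aligned}
\eta &\sim (w \to w \oplus \{\ell'\colon \mathrm{Int}\},
   (\{\ell', \ell''\} \subseteq \{\ell, \ell', \ell''\}, [\ell' \mapsto 5, \ell'' \mapsto 6])) \\
 &\le (w \to w \oplus \{\ell'\colon \mathrm{Int}\},
   (\{\ell, \ell', \ell''\} \subseteq \{\ell, \ell', \ell''\}, [\ell \mapsto \ell', \ell' \mapsto 5, \ell'' \mapsto 6]))
\end{aligned}
\]

and the latter heaplet validates $\varphi$ but not $\psi$.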


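Example 27. Least $\mu$ and greatest $\nu$ fixpoints can be encoded in higher order
logic [2]. As an example, consider

\[ \mathrm{isList} = \mu\gamma.\ \ell.\ \ell \hookrightarrow \mathrm{null} \vee
   \exists \ell', x.\ \ell \hookrightarrow (x, \ell') \star \gamma(\ell'), \]

which specifies the fact that $\ell$ is a pointer to a head of a list (eliding coproduct
injections in $\mathrm{inl}\ \mathrm{null}$ and $\mathrm{inr}(x, \ell')$). By definition,
isList satisfies the following recursive equation:

\[ \mathrm{isList}(\ell) = \ell \hookrightarrow \mathrm{null} \vee
   \exists \ell', x.\ \ell \hookrightarrow (x, \ell') \star \mathrm{isList}(\ell') \]

Let us expand the semantics of the right hand side. We have

\[
\begin{aligned}
& [\![\ell\colon \mathrm{Ref}_{\mathrm{list}}, \mathrm{isList}\colon P(\mathrm{Ref}_{\mathrm{list}}) \vdash
   \ell \hookrightarrow \mathrm{null} \vee \exists \ell', x.\ \ell \hookrightarrow (x, \ell') \star \mathrm{isList}(\ell')]\!]_w(\mathrm{isList}) \\
&\quad = \{ (\rho\colon w \to w', (\underline{\mathrm{Ref}_{\mathrm{list}}}\rho)(\ell), \delta \in \hat H w')_\sim \mid
   \mathrm{pr}_{\rho(\ell)}(\delta) = \mathrm{null} \} \\
&\qquad \cup\ [\![\ell\colon \mathrm{Ref}_{\mathrm{list}}, \mathrm{isList}\colon P(\mathrm{Ref}_{\mathrm{list}}) \vdash
   \exists \ell', x.\ \ell \hookrightarrow (x, \ell') \star \mathrm{isList}(\ell')]\!]_w(\mathrm{isList}) \\
&\quad = \{ (\rho\colon w \to w', (\underline{\mathrm{Ref}_{\mathrm{list}}}\rho)(\ell), \delta \in \hat H w')_\sim \mid \\
&\qquad\quad \mathrm{pr}_{\rho(\ell)}(\delta) = \mathrm{null} \vee
   \exists \ell', x.\ \mathrm{pr}_{\rho(\ell)}\,\delta = (x, \ell') \wedge
   (\rho, \ell', \delta \setminus \rho(\ell))_\sim \in \mathrm{isList} \}
\end{aligned}
\]

where $\delta \setminus \rho(\ell)$ denotes the $\delta$ with the cell $\rho(\ell)$ removed. In
summary, $(\rho\colon w \to w', (\underline{\mathrm{Ref}_{\mathrm{list}}}\rho)(\ell), \delta \in \hat H w')_\sim$
is in $[\![\ell\colon \mathrm{Ref}_{\mathrm{list}}, \mathrm{isList}\colon P(\mathrm{Ref}_{\mathrm{list}}) \vdash
\mathrm{isList}(\ell)]\!](\mathrm{isList})$ if and only if either
$\mathrm{pr}_{\rho(\ell)}\,\delta = \mathrm{null}$ or there exists an $\ell' \in w'$ such that
$\mathrm{pr}_{\rho(\ell)}\,\delta = (x, \ell')$ and
$(\rho, \ell', \delta \setminus \rho(\ell))_\sim \in \mathrm{isList}$.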
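
The summary above reads as a recursive membership test. The following sketch unfolds it on concrete heaplets under simplifying assumptions that are not in the text: hiding and the injection $\rho$ are ignored, a heaplet is a dictionary from location names to cell contents, `None` stands for null, and a pair `(x, next_location)` stands for a list node. The function name `is_list` is hypothetical.

```python
def is_list(l, heaplet):
    """Unfold isList(l): l holds null, or l holds (x, l2) and isList(l2)
    holds on the heaplet with the cell l removed."""
    if l not in heaplet:
        return False
    cell = heaplet[l]
    if cell is None:                     # the null case
        return True
    _x, nxt = cell                       # the node case
    rest = {k: v for k, v in heaplet.items() if k != l}
    return is_list(nxt, rest)


proper = {"a": (1, "b"), "b": (2, "c"), "c": None, "junk": None}
cyclic = {"a": (1, "b"), "b": (2, "a")}

print(is_list("a", proper))   # a finite list, extra cells are tolerated
print(is_list("a", cyclic))   # a cycle is rejected
```

Removing the visited cell before recursing is what makes the recursion terminate and what excludes cyclic structures, in line with the least fixpoint reading; the tolerance of the unrelated cell `"junk"` reflects monotonicity of the semantics.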


8 Conclusions and Further Work 


Compositionality is an uncontroversially desirable property in semantics and
reasoning, which admits strikingly different, but equally valid interpretations, as
becomes particularly instructive when modelling dynamic memory allocation.
From the programming perspective it is desirable to provide compositional means
for keeping track of integrity of the underlying data, in particular, for preventing
dangling pointers. Reasoning however inherently requires introduction of partially
defined data, such as heaplets, which due to the compositionality principle must
be regarded as first class semantic units.

Here we have made a step towards reconciling recent extensional monad-based
denotational semantics for full ground store [9] with higher order categorical
reasoning frameworks [2] by constructing a suitable intuitionistic BI-hyperdoctrine.
Much remains to be done. A highly desirable ingredient, which is currently missing
in our logic in Fig. 4, is a construct relating programs and logical assertions, such
as the following dynamic logic style modality

\[ \frac{\Gamma \vdash_{\mathsf c} p\colon A \qquad \Gamma \vdash \varphi\colon PA}
        {\Gamma \vdash [p]\varphi\colon \mathsf{prop}} \]

which would allow us e.g. in a standard way to encode Hoare triples
$\{\varphi\}p\{\psi\}$ as implications $\varphi \Rightarrow [p]\psi$. This is difficult due to
the outlined discrepancy in the semantics for construction and reasoning. The categories
of initializations for $p$ and $\varphi$ and the corresponding hiding monads are technically
incompatible. In future work we aim to deeply analyse this phenomenon and develop a
semantics for such modalities in a principled fashion.

Orthogonally to these plans we are interested in further study of the full ground 
store monad and its variants. One interesting research direction is developing 
algebraic presentations of these monads in terms of operations and equations [17]. 
Certain generic methods [13] were proposed for the simple store case (Example 3), 
and it remains to be seen if these can be generalized to the full ground store case.


Abstract. Inductive datatypes in programming languages allow users
to define useful data structures such as natural numbers, lists, trees, and 
others. In this paper we show how inductive datatypes may be added to 
the quantum programming language QPL. We construct a sound cate- 
gorical model for the language and by doing so we provide the first de- 
tailed semantic treatment of user-defined inductive datatypes in quantum 
programming. We also show our denotational interpretation is invariant 
with respect to big-step reduction, thereby establishing another novel 
result for quantum programming. Compared to classical programming, 
this property is considerably more difficult to prove and we demonstrate 
its usefulness by showing how it immediately implies computational ade- 
quacy at all types. To further cement our results, our semantics is entirely 
based on a physically natural model of von Neumann algebras, which are 
mathematical structures used by physicists to study quantum mechanics. 


Keywords: Quantum programming - Inductive types - Adequacy 


1 Introduction 


Quantum computing is a computational paradigm which takes advantage of
quantum mechanical phenomena to perform computation. A quantum computer
can solve problems which are out of reach for classical computers (e.g.
factorisation of large numbers [24], solving large linear systems [8]). The recent
developments of quantum technologies point out the necessity of filling the gap
between theoretical quantum algorithms and the actual (prototypes of) quantum
computers. As a consequence, quantum software and in particular quantum
programming languages play a key role in the future development of quantum
computing. The present paper makes several theoretical contributions towards
the design and denotational semantics of quantum programming languages.
Our development is based around the quantum programming language QPL 
[23] which we extend with inductive datatypes. Our paper is the first to construct 
a denotational semantics for user-defined inductive datatypes in quantum pro- 
gramming. In the spirit of the original QPL, our type system is affine (discarding 
of arbitrary variables is allowed, but copying is restricted). We also extend QPL 
with a copy operation for classical data, because this is an admissible operation 
in quantum mechanics which improves programming convenience. The addition 
of inductive datatypes requires a departure from the original denotational semantics
of QPL, which is based on finite-dimensional quantum structures, and
we consider instead (possibly infinite-dimensional) quantum structures based on 
W*-algebras (also known as von Neumann algebras), which have been used by 
physicists in the study of quantum foundations [25]. As such, our semantic treat- 
ment is physically natural and our model is more accessible to physicists and 
experts in quantum computing compared to most other denotational models. 

QPL is a first-order programming language which has procedures, but it does 
not have lambda abstractions. Thus, there is no use for a !-modality and we 
show how to model the copy operation by describing the canonical comonoid 
structure of all classical types (including the inductive ones). 

An important notion in quantum mechanics is the idea of causality which 
has been formulated in a variety of different ways. In this paper, we consider a 
simple operational interpretation of causality: if the output of a physical process 
is discarded, then it does not matter which process occurred [10]. In a symmetric
monoidal category $\mathbf{C}$ with tensor unit $I$, this can be understood as requiring that
for any morphism (process) $f : A_1 \to A_2$, it must be the case that $\diamond_{A_2} \circ f = \diamond_{A_1}$,
where $\diamond_{A_i} : A_i \to I$ is the discarding map (process) at the given objects. This
notion ties in very nicely with our affine language, because we have to show that 
the interpretation of values is causal, i.e., values are always discardable. 

A major contribution of this paper is that we prove the denotational seman- 
tics is invariant with respect to both small-step reduction and big-step reduction. 
The latter is more difficult in quantum programming and our paper is the first 
to demonstrate such a result. As a corollary, we obtain computational adequacy. 


2 Syntax of QPL 


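The syntax of QPL (including our extensions) is summarised in Figure 1. A well-formed
type context, denoted $\vdash \Theta$, is simply a list of distinct type variables. A
type $A$ is well-formed in type context $\Theta$, denoted $\Theta \vdash A$, if the judgement can be
derived according to the following rules (see for a more detailed exposition):

\[
\frac{\vdash \Theta}{\Theta \vdash \Theta_i}
\qquad
\frac{\vdash \Theta}{\Theta \vdash I}
\qquad
\frac{\vdash \Theta}{\Theta \vdash \mathbf{qbit}}
\qquad
\frac{\Theta \vdash A \quad \Theta \vdash B}{\Theta \vdash A \star B}\ \star \in \{+, \otimes\}
\qquad
\frac{\Theta, X \vdash A}{\Theta \vdash \mu X. A}
\]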


A type $A$ is closed if $\cdot \vdash A$. Note that nested type induction is allowed. Henceforth,
we implicitly assume that all types we are dealing with are well-formed.


Example 1. The type of natural numbers is defined as $\mathbf{Nat} = \mu X. I + X$. Lists
of a closed type $\cdot \vdash A$ are defined as $\mathbf{List}(A) = \mu Y. I + A \otimes Y$.
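
To make the shape of these two inductive types concrete, the following is a minimal Python sketch of their values, built from the constructors unit, left, right, pairing and fold. It is only an illustration and not part of QPL: the class names (`Unit`, `Left`, `Right`, `Pair`, `Fold`) and the helper functions are assumptions introduced here, and the sketch ignores typing and all quantum aspects.

```python
from dataclasses import dataclass
from typing import Any


@dataclass(frozen=True)
class Unit:
    """The unique inhabitant of the unit type I."""


@dataclass(frozen=True)
class Left:
    value: Any  # left injection into a sum type A + B


@dataclass(frozen=True)
class Right:
    value: Any  # right injection into a sum type A + B


@dataclass(frozen=True)
class Pair:
    first: Any   # component of type A in a tensor type
    second: Any  # component of type B in a tensor type


@dataclass(frozen=True)
class Fold:
    value: Any  # one layer of an inductive type mu X. A


# Nat = mu X. I + X
def zero():
    return Fold(Left(Unit()))


def succ(n):
    return Fold(Right(n))


# List(A) = mu Y. I + (A tensor Y)
def nil():
    return Fold(Left(Unit()))


def cons(head, tail):
    return Fold(Right(Pair(head, tail)))


def nat_to_int(n):
    """Count the successor layers of a Nat value."""
    count = 0
    while isinstance(n.value, Right):
        n = n.value.value
        count += 1
    return count


def list_to_python(xs):
    """Collect the heads of a List(A) value into a Python list."""
    items = []
    while isinstance(xs.value, Right):
        pair = xs.value.value
        items.append(pair.first)
        xs = pair.second
    return items
```

For instance, `succ(succ(zero()))` unfolds twice through a right injection before reaching the unit, and `zero()` and `nil()` are the same underlying value, since both types start with the summand $I$.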


Notice that our type system is not equipped with a !-modality. Indeed, in the 
absence of function types, there is no reason to introduce it. Instead, we specify 
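\[
\begin{array}{lrcl}
\text{Types} & A, B & ::= & X \mid I \mid \mathbf{qbit} \mid A + B \mid A \otimes B \mid \mu X. A \\
\text{Classical Types} & P, R & ::= & X \mid I \mid P + R \mid P \otimes R \mid \mu X. P \\
\text{Terms} & M, N & ::= & \texttt{new unit } u \mid \texttt{discard } x \mid y = \texttt{copy } x \mid \texttt{new qbit } q \mid {} \\
 & & & b = \texttt{measure } q \mid q_1, \ldots, q_n \mathrel{*}= S \mid M; N \mid \texttt{skip} \mid {} \\
 & & & \texttt{while } b \texttt{ do } M \mid x = \texttt{left}_{A,B}\, M \mid x = \texttt{right}_{A,B}\, M \mid {} \\
 & & & \texttt{case } y \texttt{ of } \{\texttt{left } x_1 \to M \mid \texttt{right } x_2 \to N\} \mid {} \\
 & & & x = (x_1, x_2) \mid (x_1, x_2) = x \mid y = \texttt{fold } x \mid y = \texttt{unfold } x \mid {} \\
 & & & \texttt{proc } f :: x : A \to y : B\ \{M\} \mid y = f(x) \\
\text{Variable contexts} & \Gamma, \Sigma & ::= & x_1 : A_1, \ldots, x_n : A_n \\
\text{Procedure contexts} & \Pi & ::= & f_1 : A_1 \to B_1, \ldots, f_n : A_n \to B_n
\end{array}
\]

\[
\Pi \vdash \langle \Gamma \rangle\ \texttt{new unit } u\ \langle \Gamma, u : I \rangle
\qquad
\Pi \vdash \langle \Gamma, x : A \rangle\ \texttt{discard } x\ \langle \Gamma \rangle
\]

\[
\frac{P \text{ is a classical type}}{\Pi \vdash \langle \Gamma, x : P \rangle\ y = \texttt{copy } x\ \langle \Gamma, x : P, y : P \rangle}
\qquad
\Pi \vdash \langle \Gamma \rangle\ \texttt{skip}\ \langle \Gamma \rangle
\]

\[
\frac{\Pi \vdash \langle \Gamma \rangle\ M\ \langle \Gamma' \rangle \quad \Pi \vdash \langle \Gamma' \rangle\ N\ \langle \Sigma \rangle}{\Pi \vdash \langle \Gamma \rangle\ M; N\ \langle \Sigma \rangle}
\qquad
\frac{\Pi \vdash \langle \Gamma, b : \mathbf{bit} \rangle\ M\ \langle \Gamma, b : \mathbf{bit} \rangle}{\Pi \vdash \langle \Gamma, b : \mathbf{bit} \rangle\ \texttt{while } b \texttt{ do } M\ \langle \Gamma, b : \mathbf{bit} \rangle}
\]

\[
\Pi \vdash \langle \Gamma \rangle\ \texttt{new qbit } q\ \langle \Gamma, q : \mathbf{qbit} \rangle
\qquad
\Pi \vdash \langle \Gamma, q : \mathbf{qbit} \rangle\ b = \texttt{measure } q\ \langle \Gamma, b : \mathbf{bit} \rangle
\]

\[
\frac{S \text{ is a unitary of arity } n}{\Pi \vdash \langle \Gamma, q_1 : \mathbf{qbit}, \ldots, q_n : \mathbf{qbit} \rangle\ q_1, \ldots, q_n \mathrel{*}= S\ \langle \Gamma, q_1 : \mathbf{qbit}, \ldots, q_n : \mathbf{qbit} \rangle}
\]

\[
\Pi \vdash \langle \Gamma, x : A \rangle\ y = \texttt{left}_{A,B}\, x\ \langle \Gamma, y : A + B \rangle
\qquad
\Pi \vdash \langle \Gamma, x : B \rangle\ y = \texttt{right}_{A,B}\, x\ \langle \Gamma, y : A + B \rangle
\]

\[
\frac{\Pi \vdash \langle \Gamma, x_1 : A \rangle\ M_1\ \langle \Sigma \rangle \quad \Pi \vdash \langle \Gamma, x_2 : B \rangle\ M_2\ \langle \Sigma \rangle}{\Pi \vdash \langle \Gamma, y : A + B \rangle\ \texttt{case } y \texttt{ of } \{\texttt{left}_{A,B}\, x_1 \to M_1 \mid \texttt{right}_{A,B}\, x_2 \to M_2\}\ \langle \Sigma \rangle}
\]

\[
\Pi \vdash \langle \Gamma, x_1 : A, x_2 : B \rangle\ x = (x_1, x_2)\ \langle \Gamma, x : A \otimes B \rangle
\qquad
\Pi \vdash \langle \Gamma, x : A \otimes B \rangle\ (x_1, x_2) = x\ \langle \Gamma, x_1 : A, x_2 : B \rangle
\]

\[
\Pi \vdash \langle \Gamma, x : A[\mu X. A / X] \rangle\ y = \texttt{fold}_{\mu X. A}\, x\ \langle \Gamma, y : \mu X. A \rangle
\qquad
\Pi \vdash \langle \Gamma, x : \mu X. A \rangle\ y = \texttt{unfold } x\ \langle \Gamma, y : A[\mu X. A / X] \rangle
\]

\[
\frac{\Pi, f : A \to B \vdash \langle x : A \rangle\ M\ \langle y : B \rangle}{\Pi \vdash \langle \Gamma \rangle\ \texttt{proc } f :: x : A \to y : B\ \{M\}\ \langle \Gamma \rangle}
\qquad
\Pi, f : A \to B \vdash \langle \Gamma, x : A \rangle\ y = f(x)\ \langle \Gamma, y : B \rangle
\]

Fig. 1: Syntax and formation rules for QPL terms. 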
the subset of types where copying is an admissible operation. The classical types 
are a subset of our types defined in Figure 1. They are characterised by the 
property that variables of classical types may be copied, whereas variables of 
non-classical types may not be copied (see the rule for copying in Figure 1). 

We use small Latin letters (e.g. x, y, u, q, b) to range over term variables. More 
specifically, $q$ ranges over variables of type $\mathbf{qbit}$, $u$ over variables of unit type $I$, $b$
over variables of type $\mathbf{bit} := I + I$ and $x, y$ range over variables of arbitrary type.
We use $\Gamma$ and $\Sigma$ to range over variable contexts. A variable context is a function
from term variables to closed types, which we write as $\Gamma = x_1 : A_1, \ldots, x_n : A_n$.

We use f,g to range over procedure names. Every procedure name f has an 
input type $A$ and an output type $B$, denoted $f : A \to B$, where $A$ and $B$ are
closed types. We use $\Pi$ to range over procedure contexts. A procedure context
is a function from procedure names to pairs of procedure input-output types,
denoted $\Pi = f_1 : A_1 \to B_1, \ldots, f_n : A_n \to B_n$.


Remark 2. Unlike lambda abstractions, procedures cannot be passed to other 
procedures as input arguments, nor can they be returned as output. 


A term judgement has the form $\Pi \vdash \langle \Gamma \rangle\ M\ \langle \Sigma \rangle$ (see Figure 1) and indicates
that term $M$ is well-formed in procedure context $\Pi$ with input variable context
$\Gamma$ and output variable context $\Sigma$. All types occurring within it are closed.

The intended interpretation of the quantum rules is as follows. The term
new qbit q prepares a new qubit $q$ in state $|0\rangle\langle 0|$. The term $q_1, \ldots, q_n \mathrel{*}= S$
applies a unitary operator $S$ to a sequence of qubits in the standard way. The
term b = measure q performs a quantum measurement on qubit q and stores the 
measurement outcome in bit b. The measured qubit is destroyed in the process. 

The no-cloning theorem of quantum mechanics [28] shows that arbitrary 
qubits cannot be copied. Because of this, copying is restricted only to classical 
types, as indicated in Figure 1, and this allows us to avoid runtime errors. Like
the original QPL [23], our type system is also affine and so any variable can be
discarded (see the formation rule for the term discard x in Figure 1).


3 Operational Semantics of QPL 


In this section we describe the operational semantics of QPL. The central notion 
is that of a program configuration which provides a complete description of the 
current state of program execution. It consists of four components that must 
satisfy some coherence properties: (1) the term which remains to be executed; 
(2) a value assignment, which is a function that assigns formal expressions to 
variables as a result of execution; (3) a procedure store which keeps track of what 
procedures have been defined so far and (4) the quantum state computed so far. 


Value Assignments. A value is an expression defined by the following grammar: 


\[
v, w ::= * \mid n \mid \texttt{left}_{A,B}\, v \mid \texttt{right}_{A,B}\, v \mid (v, w) \mid \texttt{fold}_{\mu X. A}\, v
\]

where $n$ ranges over the natural numbers. Think of $*$ as representing the unique
value of unit type $I$ and of $n$ as representing a pointer to the $n$-th qubit of a
quantum state $\rho$. Specific values of interest are $\mathtt{ff} := \texttt{left}_{I,I}\, *$ and $\mathtt{tt} := \texttt{right}_{I,I}\, *$
which correspond to false and true respectively.

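A qubit pointer context is a set $Q$ of natural numbers. A value $v$ of type $A$ is
well-formed in qubit pointer context $Q$, denoted $Q \vdash v : A$, if the judgement is
derivable from the following rules:

\[
\frac{}{\cdot \vdash * : I}
\qquad
\frac{}{\{n\} \vdash n : \mathbf{qbit}}
\qquad
\frac{Q \vdash v : A}{Q \vdash \texttt{left}_{A,B}\, v : A + B}
\qquad
\frac{Q \vdash v : B}{Q \vdash \texttt{right}_{A,B}\, v : A + B}
\]

\[
\frac{Q_1 \vdash v : A \quad Q_2 \vdash w : B \quad Q_1 \cap Q_2 = \varnothing}{Q_1, Q_2 \vdash (v, w) : A \otimes B}
\qquad
\frac{Q \vdash v : A[\mu X. A / X]}{Q \vdash \texttt{fold}_{\mu X. A}\, v : \mu X. A}
\]

If $v$ is well-formed, then its type and qubit pointer context are uniquely determined.
If $Q \vdash v : P$ with $P$ classical, then we say $v$ is a classical value.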


Lemma 3. If $Q \vdash v : P$ is a well-formed classical value, then $Q = \cdot$.


A value assignment is a function from term variables to values, which we
write as $V = \{x_1 = v_1, \ldots, x_n = v_n\}$, where $x_i$ are variables and $v_i$ are values. A
value assignment is well-formed in qubit pointer context $Q$ and variable context
$\Gamma$, denoted $Q; \Gamma \vdash V$, if $V$ has exactly the same variables as $\Gamma$, so that $\Gamma = \{x_1 :
A_1, \ldots, x_n : A_n\}$, and $Q = Q_1, \ldots, Q_n$, s.t. $Q_i \vdash v_i : A_i$. Such a splitting of $Q$ is
necessarily unique, if it exists, and some of the $Q_i$ may be empty.
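
The splitting of $Q$ described above can be sketched in Python as follows. This is an untyped approximation under stated assumptions: values are encoded as tagged tuples (a representation chosen here, not taken from the paper), types are ignored, and the function names `pointers` and `split_context` are hypothetical. The sketch only checks the pointer discipline: pair components use disjoint pointers, distinct variables use disjoint pointers, and together they cover $Q$.

```python
def pointers(value):
    """Return the qubit pointer context of a value encoded as a tagged tuple.

    Encoding (an assumption of this sketch):
    ("unit",), ("qubit", n), ("left", v), ("right", v), ("pair", v, w), ("fold", v).
    """
    tag = value[0]
    if tag == "unit":
        return frozenset()
    if tag == "qubit":
        return frozenset({value[1]})
    if tag in ("left", "right", "fold"):
        return pointers(value[1])
    if tag == "pair":
        q1, q2 = pointers(value[1]), pointers(value[2])
        if q1 & q2:
            # Mirrors the side condition that Q1 and Q2 are disjoint.
            raise ValueError("pair components share a qubit pointer")
        return q1 | q2
    raise ValueError(f"unknown value tag: {tag!r}")


def split_context(assignment, context):
    """Split `context` into one pointer set per variable, or raise if impossible."""
    seen = set()
    parts = {}
    for variable, value in assignment.items():
        q = pointers(value)
        if seen & q:
            raise ValueError("two variables share a qubit pointer")
        seen |= q
        parts[variable] = q
    if seen != set(context):
        raise ValueError("assignment does not use exactly the given pointers")
    return parts
```

For example, an assignment with a classical bit, a pair of qubits 1 and 3, and a single qubit 2 splits the context {1, 2, 3} into the empty set, {1, 3} and {2}; the empty part for the classical bit is an instance of Lemma 3.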


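Procedure Stores. A procedure store is a set of procedure definitions, written as:
$\Omega = \{f_1 :: x_1 : A_1 \to y_1 : B_1\ \{M_1\}, \ldots, f_n :: x_n : A_n \to y_n : B_n\ \{M_n\}\}$.


A procedure store is well-formed in procedure context $\Pi$, written $\Pi \vdash \Omega$, if the
judgement is derivable via the following rules:

\[
\frac{}{\cdot \vdash \cdot}
\qquad
\frac{\Pi \vdash \Omega \quad \Pi, f : A \to B \vdash \langle x : A \rangle\ M\ \langle y : B \rangle}{\Pi, f : A \to B \vdash \Omega, f :: x : A \to y : B\ \{M\}}
\]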


Program Configurations. A program configuration is a quadruple $(M \mid V \mid \Omega \mid \rho)$,
where $M$ is a term, $V$ is a value assignment, $\Omega$ is a procedure store and $\rho \in
\mathbb{C}^{2^n \times 2^n}$ is a finite-dimensional density matrix with $0 \leq \mathrm{tr}(\rho) \leq 1$. The density
matrix $\rho$ represents a (mixed) quantum state and its trace may be smaller than
one because we also use it to encode probability information (see Remark 4).
We write $\dim(\rho) = n$ to indicate that the dimension of $\rho$ is $n$.

A well-formed program configuration is a configuration $(M \mid V \mid \Omega \mid \rho)$,
where there exist (necessarily unique) $\Pi, \Gamma, \Sigma, Q$, such that: (1) $\Pi \vdash \langle \Gamma \rangle\ M\ \langle \Sigma \rangle$
is a well-formed term; (2) $Q; \Gamma \vdash V$ is a well-formed value assignment; (3)
$\Pi \vdash \Omega$ is a well-formed procedure store; and (4) $Q = \{1, 2, \ldots, \dim(\rho)\}$. We
write $\Pi; \Gamma; \Sigma; Q \vdash (M \mid V \mid \Omega \mid \rho)$ to indicate this situation. The formation
rules enforce that the qubits of $\rho$ and the qubit pointers from $V$ are in a 1-1
correspondence.
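\[
\begin{array}{l}
(\texttt{new unit } u \mid V \mid \Omega \mid \rho) \rightsquigarrow (\texttt{skip} \mid V, u = * \mid \Omega \mid \rho) \\[4pt]
(\texttt{discard } x \mid V, x = v \mid \Omega \mid \rho) \rightsquigarrow (\texttt{skip} \mid r_v(V) \mid \Omega \mid \mathrm{tr}_v(\rho)) \\[4pt]
(y = \texttt{copy } x \mid V, x = v \mid \Omega \mid \rho) \rightsquigarrow (\texttt{skip} \mid V, x = v, y = v \mid \Omega \mid \rho) \\[4pt]
(\texttt{new qbit } q \mid V \mid \Omega \mid \rho) \rightsquigarrow (\texttt{skip} \mid V, q = \dim(\rho) + 1 \mid \Omega \mid \rho \otimes |0\rangle\langle 0|) \\[4pt]
(\vec{q} \mathrel{*}= S \mid V, \vec{q} = \vec{m} \mid \Omega \mid \rho) \rightsquigarrow (\texttt{skip} \mid V, \vec{q} = \vec{m} \mid \Omega \mid S_{\vec{m}}(\rho)) \\[4pt]
(b = \texttt{measure } q \mid V, q = m \mid \Omega \mid \rho) \rightsquigarrow (\texttt{skip} \mid r_m(V), b = \mathtt{ff} \mid \Omega \mid {}_m\langle 0| \rho |0\rangle_m) \\[4pt]
(b = \texttt{measure } q \mid V, q = m \mid \Omega \mid \rho) \rightsquigarrow (\texttt{skip} \mid r_m(V), b = \mathtt{tt} \mid \Omega \mid {}_m\langle 1| \rho |1\rangle_m) \\[4pt]
(\texttt{skip}; P \mid V \mid \Omega \mid \rho) \rightsquigarrow (P \mid V \mid \Omega \mid \rho) \\[4pt]
\dfrac{(P \mid V \mid \Omega \mid \rho) \rightsquigarrow (P' \mid V' \mid \Omega' \mid \rho')}{(P; Q \mid V \mid \Omega \mid \rho) \rightsquigarrow (P'; Q \mid V' \mid \Omega' \mid \rho')} \\[10pt]
(\texttt{while } b \texttt{ do } M \mid V, b = \mathtt{ff} \mid \Omega \mid \rho) \rightsquigarrow (\texttt{skip} \mid V, b = \mathtt{ff} \mid \Omega \mid \rho) \\[4pt]
(\texttt{while } b \texttt{ do } M \mid V, b = \mathtt{tt} \mid \Omega \mid \rho) \rightsquigarrow (M; \texttt{while } b \texttt{ do } M \mid V, b = \mathtt{tt} \mid \Omega \mid \rho) \\[4pt]
(y = \texttt{left } x \mid V, x = v \mid \Omega \mid \rho) \rightsquigarrow (\texttt{skip} \mid V, y = \texttt{left } v \mid \Omega \mid \rho) \\[4pt]
(y = \texttt{right } x \mid V, x = v \mid \Omega \mid \rho) \rightsquigarrow (\texttt{skip} \mid V, y = \texttt{right } v \mid \Omega \mid \rho) \\[4pt]
(\texttt{case } y \texttt{ of } \{\texttt{left } x_1 \to M_1 \mid \texttt{right } x_2 \to M_2\} \mid V, y = \texttt{left } v \mid \Omega \mid \rho) \rightsquigarrow (M_1 \mid V, x_1 = v \mid \Omega \mid \rho) \\[4pt]
(\texttt{case } y \texttt{ of } \{\texttt{left } x_1 \to M_1 \mid \texttt{right } x_2 \to M_2\} \mid V, y = \texttt{right } v \mid \Omega \mid \rho) \rightsquigarrow (M_2 \mid V, x_2 = v \mid \Omega \mid \rho) \\[4pt]
(x = (x_1, x_2) \mid V, x_1 = v_1, x_2 = v_2 \mid \Omega \mid \rho) \rightsquigarrow (\texttt{skip} \mid V, x = (v_1, v_2) \mid \Omega \mid \rho) \\[4pt]
((x_1, x_2) = x \mid V, x = (v_1, v_2) \mid \Omega \mid \rho) \rightsquigarrow (\texttt{skip} \mid V, x_1 = v_1, x_2 = v_2 \mid \Omega \mid \rho) \\[4pt]
(y = \texttt{fold } x \mid V, x = v \mid \Omega \mid \rho) \rightsquigarrow (\texttt{skip} \mid V, y = \texttt{fold } v \mid \Omega \mid \rho) \\[4pt]
(y = \texttt{unfold } x \mid V, x = \texttt{fold } v \mid \Omega \mid \rho) \rightsquigarrow (\texttt{skip} \mid V, y = v \mid \Omega \mid \rho) \\[4pt]
(\texttt{proc } f :: x : A \to y : B\ \{M\} \mid V \mid \Omega \mid \rho) \rightsquigarrow (\texttt{skip} \mid V \mid \Omega, f :: x : A \to y : B\ \{M\} \mid \rho) \\[4pt]
(y_1 = f(x_1) \mid V, x_1 = v \mid \Omega, f :: x_2 : A \to y_2 : B\ \{M\} \mid \rho) \rightsquigarrow {} \\
\qquad (M_\alpha \mid V, x_1 = v \mid \Omega, f :: x_2 : A \to y_2 : B\ \{M\} \mid \rho)
\end{array}
\]

Fig. 2: Small Step Operational semantics of QPL. 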
The small step semantics is defined for configurations $(M \mid V \mid \Omega \mid \rho)$ by
induction on $M$ in Figure 2 and we now explain the notations used therein.

In the rule for discarding, we use two functions that depend on a value $v$.
They are $\mathrm{tr}_v$, which modifies the quantum state $\rho$ by tracing out all of its qubits
which are used in $v$, and $r_v$, which simply reindexes the value assignment, so that
the pointers within $r_v(V)$ correctly point to the corresponding qubits of $\mathrm{tr}_v(\rho)$,
which is potentially of smaller dimension than $\rho$. Formally, for a well-formed
value $v$, let $Q$ and $A$ be the unique qubit pointer context and type, such that
$Q \vdash v : A$. Then $\mathrm{tr}_v(\rho)$ is the quantum state obtained from $\rho$ by tracing out all
qubits specified by $Q$. Given a value assignment $V = \{x_1 = v_1, \ldots, x_n = v_n\}$,
then $r_v(V) = \{x_1 = r'_v(v_1), \ldots, x_n = r'_v(v_n)\}$, where:

\[
r'_v(w) = \begin{cases}
*, & \text{if } w = * \\
k - |\{i \in Q \mid i < k\}|, & \text{if } w = k \in \mathbb{N} \\
\texttt{left } r'_v(w'), & \text{if } w = \texttt{left } w' \\
\texttt{right } r'_v(w'), & \text{if } w = \texttt{right } w' \\
(r'_v(w_1), r'_v(w_2)), & \text{if } w = (w_1, w_2) \\
\texttt{fold } r'_v(w'), & \text{if } w = \texttt{fold } w'
\end{cases}
\]
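
The reindexing can be illustrated with a short Python sketch. It covers only the classical bookkeeping of the discard rule (the value assignment), not the partial trace on the density matrix. The tagged-tuple encoding of values and the names `pointers`, `reindex` and `discard` are assumptions made for this illustration.

```python
def pointers(value):
    """Collect the qubit pointers occurring in a value.

    Values are tagged tuples (an encoding assumed for this sketch):
    ("unit",), ("qubit", n), ("left", v), ("right", v), ("pair", v, w), ("fold", v).
    """
    tag = value[0]
    if tag == "unit":
        return frozenset()
    if tag == "qubit":
        return frozenset({value[1]})
    if tag == "pair":
        return pointers(value[1]) | pointers(value[2])
    return pointers(value[1])  # left, right, fold


def reindex(value, removed):
    """Shift every pointer k down by the number of removed pointers below k."""
    tag = value[0]
    if tag == "unit":
        return value
    if tag == "qubit":
        k = value[1]
        return ("qubit", k - sum(1 for i in removed if i < k))
    if tag == "pair":
        return ("pair", reindex(value[1], removed), reindex(value[2], removed))
    return (tag, reindex(value[1], removed))  # left, right, fold


def discard(assignment, variable):
    """Drop `variable` and reindex the remaining values accordingly."""
    removed = pointers(assignment[variable])
    return {
        name: reindex(value, removed)
        for name, value in assignment.items()
        if name != variable
    }
```

With four qubits where `x` holds the pair of qubits 1 and 3, discarding `x` leaves two qubits, so a variable that pointed to qubit 2 now points to qubit 1 and a variable that pointed to qubit 4 now points to qubit 2.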


In the rule for unitaries, the superoperator $S_{\vec{m}}$ applies the unitary $S$ to the
vector of qubits specified by $\vec{m}$. In the rules for measurement, the $m$-th qubit of
$\rho$ is measured in the computational basis, the measured qubit is destroyed in the
process and the measurement outcome is stored in the bit $b$. More specifically,
$|i\rangle_m = I_{2^{m-1}} \otimes |i\rangle \otimes I_{2^{n-m}}$ and ${}_m\langle i|$ is its adjoint, for $i \in \{0, 1\}$, and where $I_n$ is
the identity matrix in $\mathbb{C}^{n \times n}$.


Remark 4. Because of the way we decided to handle measurements, reduction 
($- \rightsquigarrow -$) is a nondeterministic operation, where we encode the probabilities 
of reduction within the trace of our density matrices in a similar way to [9]. 
Equivalently, we may see the reduction relation as probabilistic provided that we 
normalise all density matrices and decorate the reductions with the appropriate 
probability information as specified by the Born rule of quantum mechanics. 
The nondeterministic view leads to a more concise and clear presentation and 
because of this we have chosen it over the probabilistic view. 


The introduction rule for procedures simply defines a procedure which is 
added to the procedure store. In the rule for calling procedures, the term $M_\alpha$
is $\alpha$-equivalent to $M$ and is obtained from it by renaming the input $x_2$ to $x_1$,
renaming the output $y_2$ to $y_1$ and renaming all other variables within $M$ to some
fresh names, so as to avoid conflicts with the input, output and the rest of the 
variables within V. 


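Theorem 5 (Subject reduction). If $\Pi; \Gamma; \Sigma; Q \vdash (M \mid V \mid \Omega \mid \rho)$ and
$(M \mid V \mid \Omega \mid \rho) \rightsquigarrow (M' \mid V' \mid \Omega' \mid \rho')$, then $\Pi'; \Gamma'; \Sigma; Q' \vdash (M' \mid V' \mid \Omega' \mid \rho')$,
for some (necessarily unique) contexts $\Pi'$, $\Gamma'$, $Q'$ and where $\Sigma$ is invariant.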


Assumption 6. From now on we assume all configurations are well-formed. 
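(a) A term M

    while b do {
      new qbit q;
      q *= H;
      discard b;
      b = measure q
    }

(b) A reduction graph involving M

\[
\begin{array}{l}
(M \mid b = \mathtt{tt} \mid \cdot \mid 1) \rightsquigarrow_* (M \mid b = \mathtt{tt} \mid \cdot \mid 0.5) \ \text{ or } \ (\texttt{skip} \mid b = \mathtt{ff} \mid \cdot \mid 0.5) \\
(M \mid b = \mathtt{tt} \mid \cdot \mid 0.5) \rightsquigarrow_* (M \mid b = \mathtt{tt} \mid \cdot \mid 0.25) \ \text{ or } \ (\texttt{skip} \mid b = \mathtt{ff} \mid \cdot \mid 0.25) \\
(M \mid b = \mathtt{tt} \mid \cdot \mid 0.25) \rightsquigarrow_* \cdots \ \text{ or } \ (\texttt{skip} \mid b = \mathtt{ff} \mid \cdot \mid 0.125)
\end{array}
\]

Fig. 3: Example of a term and of a reduction graph. 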


A configuration $(M \mid V \mid \Omega \mid \rho)$ is said to be terminal if $M = \texttt{skip}$. Program
execution finishes at terminal configurations, which are characterised by the
property that they do not reduce any further. We will use calligraphic letters
($\mathcal{C}, \mathcal{D}, \ldots$) to range over configurations and we will use $\mathcal{T}$ to range over terminal
configurations. For a configuration $\mathcal{C} = (M \mid V \mid \Omega \mid \rho)$, we write for brevity
$\mathrm{tr}(\mathcal{C}) := \mathrm{tr}(\rho)$ and we shall say $\mathcal{C}$ is normalised whenever $\mathrm{tr}(\mathcal{C}) = 1$. We say that
a configuration $\mathcal{C}$ is impossible if $\mathrm{tr}(\mathcal{C}) = 0$ and we say it is possible otherwise.


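Theorem 7 (Progress). If $\mathcal{C}$ is a configuration, then either $\mathcal{C}$ is terminal or
there exists a configuration $\mathcal{D}$, such that $\mathcal{C} \rightsquigarrow \mathcal{D}$. Moreover, if $\mathcal{C}$ is not terminal,
then $\mathrm{tr}(\mathcal{C}) = \sum_{\mathcal{C} \rightsquigarrow \mathcal{D}} \mathrm{tr}(\mathcal{D})$ and there are at most two such configurations $\mathcal{D}$.


In the situation of the above theorem, the probability of reduction is given
by $\Pr(\mathcal{C} \rightsquigarrow \mathcal{D}) := \mathrm{tr}(\mathcal{D})/\mathrm{tr}(\mathcal{C})$, for any possible $\mathcal{C}$ (see Remark 4), and Theorem 7
shows the total probability of all single-step reductions is 1. If $\mathcal{C}$ is impossible,
then $\mathcal{C}$ occurs with probability 0 and subsequent reductions are also impossible.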


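Probability of Termination. Given configurations $\mathcal{C}$ and $\mathcal{D}$ let $\mathrm{Seq}_n(\mathcal{C}, \mathcal{D}) :=
\{\mathcal{C}_0 \rightsquigarrow \cdots \rightsquigarrow \mathcal{C}_n \mid \mathcal{C}_0 = \mathcal{C} \text{ and } \mathcal{C}_n = \mathcal{D}\}$, and let $\mathrm{Seq}_{\leq n}(\mathcal{C}, \mathcal{D}) = \bigcup_{i=0}^{n} \mathrm{Seq}_i(\mathcal{C}, \mathcal{D})$.
Finally, let $\mathrm{TerSeq}_{\leq n}(\mathcal{C}) = \bigcup_{\mathcal{T} \text{ terminal}} \mathrm{Seq}_{\leq n}(\mathcal{C}, \mathcal{T})$. In other words, $\mathrm{TerSeq}_{\leq n}(\mathcal{C})$
is the set of all reduction sequences from $\mathcal{C}$ which terminate in at most $n$
steps (including 0 if $\mathcal{C}$ is terminal). For every terminating reduction sequence
$r = (\mathcal{C} \rightsquigarrow \cdots \rightsquigarrow \mathcal{T})$, let $\mathrm{End}(r) := \mathcal{T}$, i.e. $\mathrm{End}(r)$ is simply the (terminal) endpoint
of the sequence.

For any configuration $\mathcal{C}$, the sequence $\left( \sum_{r \in \mathrm{TerSeq}_{\leq n}(\mathcal{C})} \mathrm{tr}(\mathrm{End}(r)) \right)_{n \in \mathbb{N}}$ is
increasing with upper bound $\mathrm{tr}(\mathcal{C})$ (follows from Theorem 7). For any possible $\mathcal{C}$,
we define:

\[
\mathrm{Halt}(\mathcal{C}) := \bigvee_{n=0}^{\infty} \sum_{r \in \mathrm{TerSeq}_{\leq n}(\mathcal{C})} \mathrm{tr}(\mathrm{End}(r)) / \mathrm{tr}(\mathcal{C})
\]

which is exactly the probability of termination of $\mathcal{C}$. This is justified, because
$\mathrm{Halt}(\mathcal{T}) = 1$, for any terminal (and possible) configuration $\mathcal{T}$ and

\[
\mathrm{Halt}(\mathcal{C}) = \sum_{\substack{\mathcal{C} \rightsquigarrow \mathcal{D} \\ \mathcal{D} \text{ possible}}} \Pr(\mathcal{C} \rightsquigarrow \mathcal{D})\, \mathrm{Halt}(\mathcal{D}).
\]

We write $\rightsquigarrow_*$ for the transitive closure of $\rightsquigarrow$.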
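(a) Procedures for generating $\mathrm{GHZ}_n$.

    proc GHZnext :: l : ListQ -> l : ListQ {
      new qbit q;
      case l of
        nil -> q *= H;
               l = q :: nil
      | q' :: l' -> q', q *= CNOT;
                    l = q :: q' :: l'
    }
    proc GHZ :: n : Nat -> l : ListQ {
      case n of
        zero -> l = nil
      | s(n') -> l = GHZnext(GHZ(n'))
    }

(b) A reduction sequence producing $\mathrm{GHZ}_3$.

\[
\begin{array}{l}
(\texttt{l = GHZ(n)} \mid n = s(s(s(zero))) \mid \Omega \mid 1) \\
\rightsquigarrow_* (\texttt{l = GHZnext(l)} \mid l = 2 :: 1 :: nil \mid \Omega \mid \gamma_2) \\
\rightsquigarrow_* (\texttt{new qbit q;} \cdots \mid l = 2 :: 1 :: nil \mid \Omega \mid \gamma_2) \\
\rightsquigarrow_* (\texttt{case l of } \cdots \mid l = 2 :: 1 :: nil, q = 3 \mid \Omega \mid \gamma_2 \otimes |0\rangle\langle 0|) \\
\rightsquigarrow_* (\texttt{q', q *= CNOT;} \cdots \mid l' = 1 :: nil, q = 3, q' = 2 \mid \Omega \mid \gamma_2 \otimes |0\rangle\langle 0|) \\
\rightsquigarrow_* (\texttt{l = q :: q' :: l'} \mid l' = 1 :: nil, q = 3, q' = 2 \mid \Omega \mid \gamma_3) \\
\rightsquigarrow_* (\texttt{skip} \mid l = 3 :: 2 :: 1 :: nil \mid \Omega \mid \gamma_3)
\end{array}
\]

Fig. 4: Example with lists of qubits and a recursive procedure. 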


Example 8. Consider the term $M$ in Figure 3. The body of the while loop
has the effect of performing a fair coin toss (realised through quantum measurement
in the standard way) and storing the outcome in variable $b$. Therefore,
starting from configuration $\mathcal{C} = (M \mid b = \mathtt{tt} \mid \cdot \mid 1)$, as in Subfigure 3b, the program
has the effect of tossing a fair coin until ff shows up. The set of terminal
configurations reachable from $\mathcal{C}$ is $\{(\texttt{skip} \mid b = \mathtt{ff} \mid \cdot \mid 2^{-i}) \mid i \in \mathbb{N}_{\geq 1}\}$ and the
last component of each configuration is a $1 \times 1$ density matrix which is exactly the
probability of reducing to the configuration. Therefore $\mathrm{Halt}(\mathcal{C}) = \sum_{i=1}^{\infty} 2^{-i} = 1$.
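
The partial sums behind this value of Halt can be reproduced with a small exact computation. The sketch below is an illustration under simplifying assumptions, not the paper's semantics: it tracks only the single freshly prepared qubit of each loop iteration as a 2 x 2 density matrix with rational entries, applies the Hadamard gate (written as an integer matrix divided by 2 so that the arithmetic stays exact), and reads off the two measurement branches as diagonal entries. The function names are hypothetical.

```python
from fractions import Fraction

# Hadamard matrix multiplied by sqrt(2); conjugating with it and dividing by 2
# gives H rho H^dagger exactly, because the matrix is real and symmetric.
H_TIMES_SQRT2 = [[1, 1], [1, -1]]


def matmul(a, b):
    """Multiply two 2 x 2 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)] for i in range(2)]


def apply_hadamard(rho):
    """Return H rho H^dagger for a 2 x 2 matrix rho."""
    product = matmul(matmul(H_TIMES_SQRT2, rho), H_TIMES_SQRT2)
    return [[entry / 2 for entry in row] for row in product]


def loop_body(weight):
    """Run one loop iteration from a configuration whose trace is `weight`.

    Returns the traces of the two successor configurations:
    (outcome ff, loop exits) and (outcome tt, loop continues).
    """
    rho = [[weight, Fraction(0)], [Fraction(0), Fraction(0)]]  # weight * |0><0|
    rho = apply_hadamard(rho)
    return rho[0][0], rho[1][1]  # <0|rho|0> and <1|rho|1>


def halt_partial_sum(iterations):
    """Total trace of the terminal configurations reached within `iterations` loop rounds."""
    weight = Fraction(1)
    terminated = Fraction(0)
    for _ in range(iterations):
        exits, continues = loop_body(weight)
        terminated += exits
        weight = continues
    return terminated
```

After $n$ rounds the accumulated trace is $1 - 2^{-n}$, so the supremum over $n$ is 1, in agreement with the example.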


Example 9. The $\mathrm{GHZ}_n$ state is defined as $\gamma_n := (|0\rangle^{\otimes n} + |1\rangle^{\otimes n})(\langle 0|^{\otimes n} + \langle 1|^{\otimes n})/2$.
In Figure 4, we define a procedure GHZ, which given a natural number $n$, generates
the state $\gamma_n$, which is represented as a list of qubits of length $n$. The
procedure uses an auxiliary procedure GHZnext, which given a list of qubits
representing the state $\gamma_n$, returns the state $\gamma_{n+1}$, again represented as a list of
qubits. The two procedures make use of some (hopefully obvious) syntactic sugar.
In Subfigure 4b, we also present the last few steps of a reduction sequence which produces
$\gamma_3$ starting from configuration $(\texttt{l = GHZ(n)} \mid n = s(s(s(zero))) \mid \Omega \mid 1)$, where
$\Omega$ contains the above mentioned procedures. In the reduction sequence we only
show the term in evaluating position and we omit some intermediate steps. The
type ListQ is a shorthand for List(qbit) from Example 1.
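
The effect of the two procedures on the quantum state can be checked with a small simulation. The sketch below makes two simplifying assumptions that are not in the paper: it uses state vectors rather than density matrices (sufficient here because every intermediate state is pure), and it keeps the most recently created qubit in the least significant bit of the basis index. The function names are hypothetical.

```python
import math


def add_qubit(state):
    """Append a fresh qubit in state |0> as the new least significant bit."""
    extended = [0.0] * (2 * len(state))
    for index, amplitude in enumerate(state):
        extended[2 * index] = amplitude
    return extended


def hadamard_on_last(state):
    """Apply the Hadamard gate to the most recently added qubit."""
    scale = 1 / math.sqrt(2)
    result = list(state)
    for index in range(0, len(state), 2):
        a, b = state[index], state[index + 1]
        result[index] = scale * (a + b)
        result[index + 1] = scale * (a - b)
    return result


def cnot_previous_to_last(state):
    """Apply CNOT with the previous head qubit as control and the new qubit as target."""
    result = list(state)
    for index in range(len(state)):
        if (index >> 1) & 1:  # control bit is set
            result[index] = state[index ^ 1]  # flip the target bit
    return result


def ghz(n):
    """Mirror the recursion of GHZ: extend the state for n - 1 by one qubit."""
    if n == 0:
        return [1.0]
    state = add_qubit(ghz(n - 1))
    if n == 1:
        return hadamard_on_last(state)  # the nil branch of GHZnext
    return cnot_previous_to_last(state)  # the cons branch of GHZnext
```

For `ghz(3)` the only non-zero amplitudes are those of the basis states 000 and 111, each equal to $1/\sqrt{2}$, which is the state vector whose density matrix is $\gamma_3$.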


4 W*-algebras 


In this section we describe our denotational model. It is based on W*-algebras, 
which are algebras of observables (i.e. physical entities), with interesting domain- 
theoretic properties. We recall some background on W*-algebras and their categorical
structure. We refer the reader to for an encyclopaedic account on
W*-algebras. 


Domain-theoretic Preliminaries. Recall that a directed subset of a poset P is 
a non-empty subset $X \subseteq P$ in which every pair of elements of $X$ has an upper
bound in $X$. A poset $P$ is a directed-complete partial order (dcpo) if each directed
subset has a supremum. A poset $P$ is pointed if it has a least element, usually
denoted by $\perp$. A monotone map $f : P \to Q$ between posets is Scott-continuous if
it preserves suprema of directed subsets. If $P$ and $Q$ are pointed and $f$ preserves
the least element, then we say $f$ is strict. We write $\mathbf{DCPO}$ ($\mathbf{DCPO}_{\perp!}$) for the
category of (pointed) dcpo's and (strict) Scott-continuous maps between them.


Definition of W*-algebras. A complex algebra is a complex vector space V 
equipped with a bilinear multiplication $(- \cdot -) : V \times V \to V$, which we write
as juxtaposition. A Banach algebra $A$ is a complex algebra $A$ equipped with a
submultiplicative norm $\| - \| : A \to \mathbb{R}_{\geq 0}$, i.e. $\forall x, y \in A : \|xy\| \leq \|x\| \|y\|$. A
*-algebra $A$ is a complex algebra $A$ with an involution $(-)^* : A \to A$ such that
$(x^*)^* = x$, $(x + y)^* = (x^* + y^*)$, $(xy)^* = y^* x^*$ and $(\lambda x)^* = \bar{\lambda} x^*$, for $x, y \in A$
and $\lambda \in \mathbb{C}$. A C*-algebra is a Banach *-algebra $A$ which satisfies the C*-identity,
i.e. $\|x^* x\| = \|x\|^2$ for all $x \in A$. A C*-algebra $A$ is unital if it has an element
$1 \in A$, such that for every $x \in A$: $x1 = 1x = x$. All C*-algebras in this paper
are unital and for brevity we regard unitality as part of their definition.


Example 10. The algebra $M_n(\mathbb{C})$ of $n \times n$ complex matrices is a C*-algebra.
In particular, the set of complex numbers $\mathbb{C}$ has a C*-algebra structure since
$M_1(\mathbb{C}) \cong \mathbb{C}$. More generally, the $n \times n$ matrices valued in a C*-algebra $A$ also
form a C*-algebra $M_n(A)$. The C*-algebra of qubits is $\mathbf{qbit} := M_2(\mathbb{C})$.


An element $x \in A$ of a C*-algebra $A$ is called positive if $\exists y \in A : x = y^* y$.
The poset of positive elements of $A$ is denoted $A^+$ and its order is given by
$x \leq y$ iff $(y - x) \in A^+$. The unit interval of $A$ is the subposet $[0, 1]_A \subseteq A^+$ of
all positive elements $x$ such that $0 \leq x \leq 1$.

Let $f : A \to B$ be a linear map between C*-algebras $A$ and $B$. We say
that $f$ is positive if it preserves positive elements. We say that $f$ is completely
positive if it is $n$-positive for every $n \in \mathbb{N}$, i.e. the map $M_n(f) : M_n(A) \to
M_n(B)$ defined for every matrix $[x_{i,j}]_{1 \leq i,j \leq n} \in M_n(A)$ by $M_n(f)([x_{i,j}]_{1 \leq i,j \leq n}) =
[f(x_{i,j})]_{1 \leq i,j \leq n}$ is positive. The map $f$ is called multiplicative, involutive, unital
if it preserves multiplication, involution, and the unit, respectively. The map $f$
is called subunital whenever the inequalities $0 \leq f(1) \leq 1$ hold. A state on a
C*-algebra $A$ is a completely positive unital map $s : A \to \mathbb{C}$.

Although W*-algebras are commonly defined in topological terms (as C*- 
algebras closed under several operator topologies) or equivalently in algebraic 
terms (as C*-algebras which are their own bicommutant), one can also equivalently
define them in domain-theoretic terms [19], as we do next.

A completely positive map between C*-algebras is normal if its restriction
to the unit interval is Scott-continuous [19, Proposition A.3]. A W*-algebra is a
C*-algebra $A$ such that the unit interval $[0, 1]_A$ is a dcpo, and $A$ has a separating
set of normal states: for every $x \in A^+$, if $x \neq 0$, then there is a normal state
$s : A \to \mathbb{C}$ such that $s(x) \neq 0$ [25, Theorem III.3.16].

A linear map $f : A \to B$ between W*-algebras $A$ and $B$ is called an NCPSU-map
if $f$ is normal, completely positive and subunital. The map $f$ is called an
NMIU-map if $f$ is normal, multiplicative, involutive and unital. We note that
every NMIU-map is necessarily an NCPSU-map and that W*-algebras are closed
under formation of matrix algebras as in Example 10.


Categorical Structure. Let $\mathbf{W}^*_{\mathrm{NCPSU}}$ be the category of W*-algebras and NCPSU-maps
and let $\mathbf{W}^*_{\mathrm{NMIU}}$ be its full-on-objects subcategory of NMIU-maps. Throughout
the rest of the paper let $\mathbf{C} := (\mathbf{W}^*_{\mathrm{NCPSU}})^{\mathrm{op}}$ and let $\mathbf{V} := (\mathbf{W}^*_{\mathrm{NMIU}})^{\mathrm{op}}$. QPL
types are interpreted as functors $[\![\Theta \vdash A]\!] : \mathbf{V}^{|\Theta|} \to \mathbf{V}$ and closed QPL types as
objects $[\![A]\!] \in \mathrm{Ob}(\mathbf{V}) = \mathrm{Ob}(\mathbf{C})$. One should think of $\mathbf{V}$ as the category of values,
because the interpretation of our values from §3 are indeed $\mathbf{V}$-morphisms.
General QPL terms are interpreted as morphisms of $\mathbf{C}$, so one should think of
$\mathbf{C}$ as the category of computations. We now describe the categorical structure of
$\mathbf{V}$ and $\mathbf{C}$ and later we justify our choice for working in the opposite categories.

Both $\mathbf{C}$ and $\mathbf{V}$ have a symmetric monoidal structure when equipped with
the spatial tensor product, denoted here by $(- \otimes -)$, and tensor unit $I := \mathbb{C}$
Section 10]. Moreover, $\mathbf{V}$ is symmetric monoidal closed and also complete and
cocomplete [11]. $\mathbf{C}$ and $\mathbf{V}$ have finite coproducts, given by direct sums of W*-algebras
[2, Proposition 4.7.3]. The coproduct of objects $A$ and $B$ is denoted
by $A + B$ and the coproduct injections are denoted $\mathrm{left}_{A,B} : A \to A + B$ and
$\mathrm{right}_{A,B} : B \to A + B$. Given morphisms $f : A \to C$ and $g : B \to C$, we write
$[f, g] : A + B \to C$ for the unique cocone morphism induced by the coproduct.
Moreover, coproducts distribute over tensor products [2, §4.6]. More specifically,
there exists a natural isomorphism $d_{A,B,C} : A \otimes (B + C) \to (A \otimes B) + (A \otimes C)$
which satisfies the usual coherence conditions. The initial object in $\mathbf{C}$ is moreover
a zero object and is denoted 0. The W*-algebra of bits is $\mathbf{bit} := I + I = \mathbb{C} \oplus \mathbb{C}$.

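The categories $\mathbf{V}$, $\mathbf{C}$ and $\mathbf{Set}$ are related by symmetric monoidal adjunctions:

\[
F \dashv G \ \text{ with } F : \mathbf{Set} \to \mathbf{V},\ G : \mathbf{V} \to \mathbf{Set},
\qquad
J \dashv R \ \text{ with } J : \mathbf{V} \to \mathbf{C},\ R : \mathbf{C} \to \mathbf{V}
\qquad \text{[26, pp. 11]}
\]

and the subcategory inclusion $J$ preserves coproducts and tensors up to equality.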

Interpreting QPL within C and V is not an ad hoc trick. In physical terms, 
this corresponds to adopting the Heisenberg picture of quantum mechanics and 
this is usually done when working with infinite-dimensional W*-algebras (like 
we do). Semantically, this is necessary, because (1) our type system has conditional
branching and we need to interpret QPL terms within a category with
finite coproducts; (2) we have to be able to compute parameterised initial algebras
to interpret inductive datatypes. The category $\mathbf{W}^*_{\mathrm{NCPSU}}$ has finite products,
but it does not have coproducts, so by interpreting QPL terms within
$\mathbf{C} = (\mathbf{W}^*_{\mathrm{NCPSU}})^{\mathrm{op}}$ we solve problem (1). For (2), the monoidal closure of $\mathbf{V} =
(\mathbf{W}^*_{\mathrm{NMIU}})^{\mathrm{op}}$ is crucial, because it implies the tensor product preserves $\omega$-colimits.
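\[
\begin{array}{ll}
\mathrm{tr} : M_n(\mathbb{C}) \to \mathbb{C} & \mathrm{tr} :: A \mapsto \sum_i A_{i,i} \\
\mathrm{new}_\rho : \mathbb{C} \to M_{2^n}(\mathbb{C}) & \mathrm{new}_\rho :: a \mapsto a\rho \\
\mathrm{meas} : M_2(\mathbb{C}) \to \mathbb{C} \oplus \mathbb{C} & \mathrm{meas} :: \begin{pmatrix} a & b \\ c & d \end{pmatrix} \mapsto (a, d) \\
\mathrm{unitary}_S : M_{2^n}(\mathbb{C}) \to M_{2^n}(\mathbb{C}) & \mathrm{unitary}_S :: A \mapsto S A S^\dagger \\[6pt]
\mathrm{tr}^\dagger : \mathbb{C} \to M_n(\mathbb{C}) & \mathrm{tr}^\dagger :: a \mapsto a I_n \\
\mathrm{new}_\rho^\dagger : M_{2^n}(\mathbb{C}) \to \mathbb{C} & \mathrm{new}_\rho^\dagger :: A \mapsto \mathrm{tr}(A\rho) \\
\mathrm{meas}^\dagger : \mathbb{C} \oplus \mathbb{C} \to M_2(\mathbb{C}) & \mathrm{meas}^\dagger :: (a, d) \mapsto \begin{pmatrix} a & 0 \\ 0 & d \end{pmatrix} \\
\mathrm{unitary}_S^\dagger : M_{2^n}(\mathbb{C}) \to M_{2^n}(\mathbb{C}) & \mathrm{unitary}_S^\dagger :: A \mapsto S^\dagger A S
\end{array}
\]

Fig. 5: A selection of maps in the Schrödinger picture ($f : A \to B$) and their
Hermitian adjoints ($f^\dagger : B \to A$) used in the Heisenberg picture.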


Convex Sums. In both $\mathbf{C}$ and $\mathbf{W}^*_{\mathrm{NCPSU}}$, morphisms are closed under convex
sums, which are defined pointwise, as usual. More specifically, given NCPSU-maps
$f_1, \ldots, f_n : A \to B$ and real numbers $p_i \in [0, 1]$ with $\sum_i p_i \leq 1$, then the
map $\sum_i p_i f_i : A \to B$ is also an NCPSU-map.


Order-enrichment. For W*-algebras A and B, we define a partial order on 
$\mathbf{C}(A, B)$ by: $f \leq g$ iff $g - f$ is a completely positive map. Equipped with
this order, our category $\mathbf{C}$ is $\mathbf{DCPO}_{\perp!}$-enriched [3, Theorem 4.3]. The least element
in $\mathbf{C}(A, B)$ is also a zero morphism and is given by the map $0 : A \to B$,
defined by $0(a) = 0$. Also, the coproduct structure and the symmetric monoidal
structure are both $\mathbf{DCPO}_{\perp!}$-enriched [2, Corollary 4.9.15] [3, Theorem 4.5].


Quantum Operations. For convenience, our operational semantics adopts the 
Schrödinger picture of quantum mechanics, which is the picture most experts in 
quantum computing are familiar with. However, as we have just explained, our 
denotational semantics has to adopt the Heisenberg picture. The two pictures are 
equivalent in finite dimensions and we will now show how to translate from one 
to the other. By doing so, we provide an explicit description (in both pictures) 
of the required quantum maps that we need to interpret QPL. 

Consider the maps in Figure 5. The map tr is used to trace out (or discard)
parts of quantum states. Density matrices $\rho$ are in 1-1 correspondence with the
maps $\mathrm{new}_\rho$, which we use in our semantics to describe (mixed) quantum states.
The meas map simply measures a qubit in the computational basis and returns
a bit as measurement outcome. The $\mathrm{unitary}_S$ map is used for application of a
unitary $S$. These maps work as described in the Schrödinger picture of quantum
mechanics, i.e., the category $\mathbf{W}^*_{\mathrm{NCPSU}}$. For every map $f : A \to B$ among those
mentioned, $f^\dagger : B \to A$ indicates its Hermitian adjoint³. In the Heisenberg
picture, composition of maps is done in the opposite way, so we simply write
$f^\ddagger := (f^\dagger)^{\mathrm{op}} \in \mathbf{C}(A, B)$ for the Hermitian adjoint of $f$ when seen as a morphism
in $(\mathbf{W}^*_{\mathrm{NCPSU}})^{\mathrm{op}} = \mathbf{C}$. Thus, the mapping $(-)^\ddagger$ translates the above operations
from the Schrödinger picture (the category $\mathbf{W}^*_{\mathrm{NCPSU}}$) to the Heisenberg picture
(the category $\mathbf{C}$) of quantum mechanics.


3 This adjoint exists, because A and B are finite-dimensional W*-algebras which therefore
have the structure of a Hilbert space when equipped with the Hilbert-Schmidt
inner product [27, pp. 145].
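
The adjointness between the two rows of Figure 5 can be checked numerically for small cases. The following Python sketch is only an illustration: it represents matrices as nested lists of complex numbers, uses the Hilbert-Schmidt inner product $\langle A, B \rangle = \mathrm{tr}(A^\dagger B)$ on matrices and the standard inner product on $\mathbb{C} \oplus \mathbb{C}$ (the latter is an assumption about how the direct sum is equipped), and all function names are hypothetical.

```python
def hs_inner(a, b):
    """Hilbert-Schmidt inner product tr(a^dagger b) of two square matrices."""
    n = len(a)
    return sum(a[i][j].conjugate() * b[i][j] for i in range(n) for j in range(n))


def pair_inner(u, v):
    """Standard inner product on tuples of complex numbers."""
    return sum(x.conjugate() * y for x, y in zip(u, v))


def meas(a):
    """Schrodinger picture: keep the diagonal of a 2 x 2 matrix."""
    return (a[0][0], a[1][1])


def meas_dagger(v):
    """Heisenberg picture: embed a pair as a diagonal 2 x 2 matrix."""
    return [[v[0], 0], [0, v[1]]]


def new_rho(rho, a):
    """Schrodinger picture: scale the density matrix rho by the scalar a."""
    return [[a * entry for entry in row] for row in rho]


def new_rho_dagger(rho, b):
    """Heisenberg picture: the scalar tr(b rho)."""
    n = len(rho)
    return sum(b[i][j] * rho[j][i] for i in range(n) for j in range(n))


A = [[1 + 2j, 3], [4j, 5 - 1j]]
PAIR = (2 - 1j, 0.5j)
RHO = [[0.75, 0.25j], [-0.25j, 0.25]]  # Hermitian with trace 1
SCALAR = 1 - 2j

# <meas(A), v> agrees with <A, meas^dagger(v)>.
MEAS_GAP = abs(pair_inner(meas(A), PAIR) - hs_inner(A, meas_dagger(PAIR)))

# <new_rho(a), B> agrees with <a, new_rho^dagger(B)>.
NEW_GAP = abs(hs_inner(new_rho(RHO, SCALAR), A) - SCALAR.conjugate() * new_rho_dagger(RHO, A))
```

Both gaps vanish up to floating-point error, which is the defining property of a Hermitian adjoint; the second identity relies on `RHO` being Hermitian.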
Parameterised Initial Algebras. In order to interpret inductive datatypes, we 
need to be able to compute parameterised initial algebras for the functors induced
by our type expressions. $\mathbf{V}$ is ideal for this, because it is cocomplete and
monoidal closed and so all type expressions induce functors on $\mathbf{V}$ which preserve
$\omega$-colimits.


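Definition 11 (cf. [6, §6.1]). Given a category $\mathbf{A}$ and a functor $T : \mathbf{A}^n \to \mathbf{A}$,
with $n \geq 1$, a parameterised initial algebra for $T$ is a pair $(T^\sharp, \phi^T)$, such that:


- $T^\sharp : \mathbf{A}^{n-1} \to \mathbf{A}$ is a functor;

- $\phi^T : T \circ \langle \mathrm{Id}, T^\sharp \rangle \Rightarrow T^\sharp : \mathbf{A}^{n-1} \to \mathbf{A}$ is a natural isomorphism;

- For every $A \in \mathrm{Ob}(\mathbf{A}^{n-1})$, the pair $(T^\sharp A, \phi^T_A)$ is an initial $T(A, -)$-algebra.


Proposition 12. Every $\omega$-cocontinuous functor $T : \mathbf{V}^n \to \mathbf{V}$ has a parameterised
initial algebra $(T^\sharp, \phi^T)$ with $T^\sharp : \mathbf{V}^{n-1} \to \mathbf{V}$ being $\omega$-cocontinuous.


Proof. $\mathbf{V}$ is cocomplete, so this follows from [13, §4.3].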


5 Denotational Semantics of QPL 


In this section we describe the denotational semantics of QPL. 


5.1 Interpretation of Types 


The interpretation of a type $\Theta \vdash A$ is a functor $[\![\Theta \vdash A]\!] : \mathbf{V}^{|\Theta|} \to \mathbf{V}$, defined
by induction on the derivation of $\Theta \vdash A$ in Figure 6. As usual, one has to prove
this assignment is well-defined by showing the required initial algebras exist. 


Proposition 13. The assignment in Figure 6 is well-defined.


Proof. By induction, every $[\![\Theta \vdash A]\!]$ is an $\omega$-cocontinuous functor and thus it has
a parameterised initial algebra by Proposition 12.


Lemma 14 (Type Substitution). Given types $\Theta, X \vdash A$ and $\Theta \vdash B$, then:
$[\![\Theta \vdash A[B/X]]\!] = [\![\Theta, X \vdash A]\!] \circ \langle \mathrm{Id}, [\![\Theta \vdash B]\!] \rangle$.


Proof. Straightforward induction. 


For simplicity, the interpretation of terms is only defined on closed types and so 
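we introduce more concise notation for them. For any closed type $\cdot \vdash A$ we write
for convenience $[\![A]\!] := [\![\cdot \vdash A]\!](*) \in \mathrm{Ob}(\mathbf{V})$, where $*$ is the unique object of the
terminal category $\mathbf{1}$. Notice also that $[\![A]\!] \in \mathrm{Ob}(\mathbf{C}) = \mathrm{Ob}(\mathbf{V})$.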


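Definition 15. Given a closed type $\cdot \vdash \mu X. A$, we define an isomorphism (in
$\mathbf{V}$):

\[
\mathrm{fold}_{\mu X. A} : [\![A[\mu X. A / X]]\!] = [\![X \vdash A]\!] [\![\mu X. A]\!] \cong [\![\mu X. A]\!] : \mathrm{unfold}_{\mu X. A}
\]

where the equality is Lemma 14 and the iso is the initial algebra structure.


Example 16. The interpretation of the types from Example 1 are $[\![\mathbf{Nat}]\!] =
\bigoplus_{i=0}^{\omega} \mathbb{C}$ and $[\![\mathbf{List}(A)]\!] = \bigoplus_{i=0}^{\omega} [\![A]\!]^{\otimes i}$. Specifically, $[\![\mathbf{List}(\mathbf{qbit})]\!] = \bigoplus_{i=0}^{\omega} \mathbb{C}^{2^i \times 2^i}$.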
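$$\llbracket \Theta \vdash A \rrbracket : \mathbf{V}^{|\Theta|} \to \mathbf{V}$$

\begin{align*}
\llbracket \Theta \vdash \Theta_i \rrbracket &= \Pi_i \\
\llbracket \Theta \vdash I \rrbracket &= K_I \\
\llbracket \Theta \vdash \mathrm{qbit} \rrbracket &= K_{\mathrm{qbit}} \\
\llbracket \Theta \vdash A + B \rrbracket &= + \circ \langle \llbracket \Theta \vdash A \rrbracket, \llbracket \Theta \vdash B \rrbracket \rangle \\
\llbracket \Theta \vdash A \otimes B \rrbracket &= \otimes \circ \langle \llbracket \Theta \vdash A \rrbracket, \llbracket \Theta \vdash B \rrbracket \rangle \\
\llbracket \Theta \vdash \mu X.A \rrbracket &= \llbracket \Theta, X \vdash A \rrbracket^{\dagger}
\end{align*}


Fig. 6: Interpretations of types. K_A is the constant-A-functor. 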


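\begin{align*}
\llbracket \cdot \vdash * : I \rrbracket &:= \mathrm{id}_I \\
\llbracket \{n\} \vdash n : \mathrm{qbit} \rrbracket &:= \mathrm{id}_{\mathrm{qbit}} \\
\llbracket Q \vdash \mathrm{left}_{A,B}\, v : A + B \rrbracket &:= \mathrm{left} \circ \llbracket v \rrbracket \\
\llbracket Q \vdash \mathrm{right}_{A,B}\, v : A + B \rrbracket &:= \mathrm{right} \circ \llbracket v \rrbracket \\
\llbracket Q_1, Q_2 \vdash (v, w) : A \otimes B \rrbracket &:= \llbracket v \rrbracket \otimes \llbracket w \rrbracket \\
\llbracket Q \vdash \mathrm{fold}_{\mu X.A}\, v : \mu X.A \rrbracket &:= \mathrm{fold} \circ \llbracket v \rrbracket
\end{align*}


Fig. 7: Interpretation of values. 


\begin{align*}
\llbracket \Pi \vdash \langle \Gamma, x : A \rangle\ \mathbf{discard}\ x\ \langle \Gamma \rangle \rrbracket &:= \pi \mapsto (r \circ (\mathrm{id} \otimes \diamond)) \\
\llbracket \Pi \vdash \langle \Gamma, x : P \rangle\ y = \mathbf{copy}\ x\ \langle \Gamma, x : P, y : P \rangle \rrbracket &:= \pi \mapsto (\mathrm{id} \otimes \triangle) \\
\llbracket \Pi \vdash \langle \Gamma \rangle\ \mathbf{new\ qbit}\ q\ \langle \Gamma, q : \mathrm{qbit} \rangle \rrbracket &:= \pi \mapsto ((\mathrm{id} \otimes \mathrm{new}^{\ddagger}_{|0\rangle\langle 0|}) \circ r^{-1}) \\
\llbracket \Pi \vdash \langle \Gamma, q : \mathrm{qbit} \rangle\ b = \mathbf{measure}\ q\ \langle \Gamma, b : \mathrm{bit} \rangle \rrbracket &:= \pi \mapsto (\mathrm{id} \otimes \mathrm{meas}^{\ddagger}) \\
\llbracket \Pi \vdash \langle \Gamma, \vec{q} : \overrightarrow{\mathrm{qbit}} \rangle\ \vec{q} \mathrel{*}= S\ \langle \Gamma, \vec{q} : \overrightarrow{\mathrm{qbit}} \rangle \rrbracket &:= \pi \mapsto (\mathrm{id} \otimes \mathrm{unitary}^{\ddagger}_{S}) \\
\llbracket \Pi \vdash \langle \Gamma \rangle\ M; N\ \langle \Sigma \rangle \rrbracket &:= \pi \mapsto (\llbracket N \rrbracket(\pi) \circ \llbracket M \rrbracket(\pi)) \\
\llbracket \Pi \vdash \langle \Gamma \rangle\ \mathbf{skip}\ \langle \Gamma \rangle \rrbracket &:= \pi \mapsto \mathrm{id} \\
\llbracket \Pi \vdash \langle \Gamma, b : \mathrm{bit} \rangle\ \mathbf{while}\ b\ \mathbf{do}\ M\ \langle \Gamma, b : \mathrm{bit} \rangle \rrbracket &:= \pi \mapsto \mathrm{lfp}(W_{\llbracket M \rrbracket(\pi)}) \\
\llbracket \Pi \vdash \langle \Gamma, x : A \rangle\ y = \mathbf{left}_{A,B}\ x\ \langle \Gamma, y : A + B \rangle \rrbracket &:= \pi \mapsto (\mathrm{id} \otimes \mathrm{left}_{A,B}) \\
\llbracket \Pi \vdash \langle \Gamma, x : B \rangle\ y = \mathbf{right}_{A,B}\ x\ \langle \Gamma, y : A + B \rangle \rrbracket &:= \pi \mapsto (\mathrm{id} \otimes \mathrm{right}_{A,B}) \\
\llbracket \Pi \vdash \langle \Gamma, y : A + B \rangle\ \mathbf{case}\ y\ \mathbf{of}\ \{\mathbf{left}\ x_1 \to M_1 \mid \mathbf{right}\ x_2 \to M_2\}\ \langle \Sigma \rangle \rrbracket &:= \pi \mapsto ([\llbracket M_1 \rrbracket(\pi), \llbracket M_2 \rrbracket(\pi)] \circ d) \\
\llbracket \Pi \vdash \langle \Gamma, x_1 : A, x_2 : B \rangle\ x = (x_1, x_2)\ \langle \Gamma, x : A \otimes B \rangle \rrbracket &:= \pi \mapsto \mathrm{id} \\
\llbracket \Pi \vdash \langle \Gamma, x : A \otimes B \rangle\ (x_1, x_2) = x\ \langle \Gamma, x_1 : A, x_2 : B \rangle \rrbracket &:= \pi \mapsto \mathrm{id} \\
\llbracket \Pi \vdash \langle \Gamma, x : A[\mu X.A/X] \rangle\ y = \mathbf{fold}\ x\ \langle \Gamma, y : \mu X.A \rangle \rrbracket &:= \pi \mapsto (\mathrm{id} \otimes \mathrm{fold}) \\
\llbracket \Pi \vdash \langle \Gamma, x : \mu X.A \rangle\ y = \mathbf{unfold}\ x\ \langle \Gamma, y : A[\mu X.A/X] \rangle \rrbracket &:= \pi \mapsto (\mathrm{id} \otimes \mathrm{unfold}) \\
\llbracket \Pi \vdash \langle \Gamma \rangle\ \mathbf{proc}\ f :: x : A \to y : B\ \{M\}\ \langle \Gamma \rangle \rrbracket &:= \pi \mapsto \mathrm{id} \\
\llbracket \Pi, f : A \to B \vdash \langle \Gamma, x : A \rangle\ y = f(x)\ \langle \Gamma, y : B \rangle \rrbracket &:= (\pi, f) \mapsto (\mathrm{id} \otimes f),
\end{align*}

where r is the right monoidal unit. For simplicity, we omit the monoidal associator. 


Fig. 8: Interpretation of QPL terms. 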
5.2 Copying and Discarding 


Our type system is affine, so we have to construct discarding maps at all types. 
The tensor unit I is a terminal object in V (but not in C) which leads us to the 
next definition. 


Definition 17 (Discarding map). For any W*-algebra A, let ⋄_A : A → I be 
the unique morphism of V with the indicated domain and codomain. 


We will see that all values admit an interpretation as V-morphisms and are 
therefore discardable. In physical terms, this means values are causal (in the sense 
mentioned in the introduction). Of course, this is not true for the interpretation 
of general terms (which correspond to C-morphisms). 

Our language is equipped with a copy operation on classical data, so we have 
to explain how to copy classical values. We do this by constructing a copy map 
defined at all classical types using results from [34]. 

Proposition 18. Using the categorical data of 
$\mathbf{Set} \underset{G}{\overset{F}{\rightleftarrows}} \mathbf{V}$, one can 
define a copy map △_{⟦P⟧} : ⟦P⟧ → ⟦P⟧ ⊗ ⟦P⟧ for every classical type · ⊢ P, such 
that the triple (⟦P⟧, △_{⟦P⟧}, ⋄_{⟦P⟧}) forms a cocommutative comonoid in V. 


We shall later see that the interpretations of our classical values are comonoid 
homomorphisms (w.r.t. Proposition 18) and therefore they may be copied. 


5.3 Interpretation of Terms 


Given a variable context Γ = x₁ : A₁, ..., xₙ : Aₙ, we interpret it as the object 
⟦Γ⟧ := ⟦A₁⟧ ⊗ ··· ⊗ ⟦Aₙ⟧ ∈ Ob(C). The interpretation of a procedure context 
Π = f₁ : A₁ → B₁, ..., fₙ : Aₙ → Bₙ is defined to be the pointed dcpo 
⟦Π⟧ := C(A₁, B₁) × ··· × C(Aₙ, Bₙ). A term Π ⊢ ⟨Γ⟩ M ⟨Σ⟩ is interpreted as 
a Scott-continuous function ⟦Π ⊢ ⟨Γ⟩ M ⟨Σ⟩⟧ : ⟦Π⟧ → C(⟦Γ⟧, ⟦Σ⟧) defined by 
induction on the derivation of Π ⊢ ⟨Γ⟩ M ⟨Σ⟩ in Figure 8. For brevity, we often 
write ⟦M⟧ := ⟦Π ⊢ ⟨Γ⟩ M ⟨Σ⟩⟧, when the contexts are clear or unimportant. 
We now explain some of the notation used in Figure 8. The rules for 
manipulating qubits use the morphisms new^‡_{|0⟩⟨0|}, meas^‡ and unitary^‡_S which are 
defined in §4. For the interpretation of while loops, given an arbitrary 
morphism f : A ⊗ bit → A ⊗ bit of C, we define a Scott-continuous endofunction 

\begin{align*}
W_f &: \mathbf{C}(A \otimes \mathrm{bit}, A \otimes \mathrm{bit}) \to \mathbf{C}(A \otimes \mathrm{bit}, A \otimes \mathrm{bit}) \\
W_f(g) &= [\mathrm{id} \otimes \mathrm{left}_{I,I},\ g \circ f \circ (\mathrm{id} \otimes \mathrm{right}_{I,I})] \circ d_{A,I,I},
\end{align*}

where the isomorphism d_{A,I,I} : A ⊗ (I + I) → (A ⊗ I) + (A ⊗ I) is explained 
in §4. For any pointed dcpo D and Scott-continuous function h : D → D, its 
least fixpoint is $\mathrm{lfp}(h) := \bigvee_{i=0}^{\infty} h^i(\bot)$, where ⊥ is the least element of D. 
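
The least-fixpoint formula can be illustrated on a much simpler dcpo than the hom-sets of C. The sketch below is only an illustration under stated assumptions: it uses finite sets ordered by inclusion (with the empty set as least element), and the names `lfp` and `step` are hypothetical. In a finite poset the chain of approximants h^i(⊥) stabilises after finitely many steps, which is not true in general for the while-loop semantics above, where the supremum is a genuine limit.

```python
def lfp(h, bottom):
    """Kleene iteration: bottom, h(bottom), h(h(bottom)), ... until stable.

    Only terminates when the ascending chain stabilises, e.g. for a
    monotone h on a finite poset. This is an assumption of the sketch.
    """
    current = bottom
    while True:
        following = h(current)
        if following == current:
            return current
        current = following


def step(s):
    """A monotone function on subsets of {0, ..., 5} (hypothetical example)."""
    return frozenset({0}) | frozenset(x + 1 for x in s if x < 5)


# The approximants are {}, {0}, {0, 1}, ..., {0, ..., 5}.
reachable = lfp(step, frozenset())
```

Here `frozenset()` plays the role of ⊥ and set union plays the role of the directed supremum.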


Remark 19. The term semantics for defining and calling procedures does not 
involve any fixpoint computations. The required fixpoint computations are done 
when interpreting procedure stores, as we shall see next. 
5.4 Interpretation of Configurations 


Before we may interpret program configurations, we first have to describe how 
to interpret values and procedure stores. 


Interpretation of Values. A qubit pointer context Q is interpreted as the 
object ⟦Q⟧ = qbit^{⊗|Q|}. A value Q ⊢ v : A is interpreted as a morphism in V 
⟦Q ⊢ v : A⟧ : ⟦Q⟧ → ⟦A⟧, which we abbreviate as ⟦v⟧ if Q and A are clear from 
context. It is defined by induction on the derivation of Q ⊢ v : A in Figure 7. 
For the next theorem, recall that if Q ⊢ v : A is a classical value, then Q = ·. 


Theorem 20. Let Q ⊢ v : A be a value. Then: 


1. ⟦v⟧ is discardable (i.e. causal). More specifically, ⋄_{⟦A⟧} ∘ ⟦v⟧ = ⋄_{⟦Q⟧} = tr. 
2. If A is classical, then ⟦v⟧ is copyable, i.e., △_{⟦A⟧} ∘ ⟦v⟧ = (⟦v⟧ ⊗ ⟦v⟧) ∘ △_I. 


We see that, as promised, interpretations of values may always be discarded 
and interpretations of classical values may also be copied. Next, we explain how 
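to interpret value contexts. For a value context Q; Γ ⊢ V, its interpretation is 
the morphism: 

$$\llbracket Q; \Gamma \vdash V \rrbracket = \left( \llbracket Q \rrbracket \xrightarrow{\ \cong\ } \llbracket Q_1 \rrbracket \otimes \cdots \otimes \llbracket Q_n \rrbracket \xrightarrow{\ \llbracket v_1 \rrbracket \otimes \cdots \otimes \llbracket v_n \rrbracket\ } \llbracket \Gamma \rrbracket \right),$$

where Qᵢ ⊢ vᵢ : Aᵢ is the splitting of Q (see §3) and ⟦Γ⟧ = ⟦A₁⟧ ⊗ ··· ⊗ ⟦Aₙ⟧. 
Some of the Qᵢ can be empty and this is the reason why the definition depends 
on a coherent natural isomorphism. We write ⟦V⟧ as a shorthand for ⟦Q; Γ ⊢ V⟧. 
Obviously, ⟦V⟧ is also causal thanks to Theorem 20. 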


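Interpretation of Procedure Stores. The interpretation of a well-formed 
procedure store Π ⊢ Ω is an element of ⟦Π⟧, i.e. a |Π|-tuple of morphisms from C. It 
is defined by induction on Π ⊢ Ω: 

\begin{align*}
\llbracket \cdot \vdash \cdot \rrbracket &= () \\
\llbracket \Pi, f : A \to B \vdash \Omega, f :: x : A \to y : B\ \{M\} \rrbracket &= (\llbracket \Omega \rrbracket, \mathrm{lfp}(\llbracket M \rrbracket(\llbracket \Omega \rrbracket, -))).
\end{align*}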


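Interpretation of Configurations. Density matrices ρ ∈ M_{2ⁿ}(ℂ) are in 1-1 
correspondence with W*_{CPSU}-morphisms new_ρ : ℂ → M_{2ⁿ}(ℂ) which are in turn in 
1-1 correspondence with C-morphisms new^‡_ρ : I → qbit^{⊗n}. Using this 
observation, we can now define the interpretation of a configuration C = (M | V | Ω | ρ) 
with Π; Γ; Σ; Q ⊢ (M | V | Ω | ρ) to be the morphism 

$$\llbracket \Pi; \Gamma; \Sigma; Q \vdash (M \mid V \mid \Omega \mid \rho) \rrbracket := \left( I \xrightarrow{\ \mathrm{new}^{\ddagger}_{\rho}\ } \mathrm{qbit}^{\otimes \dim(\rho)} \xrightarrow{\ \llbracket Q; \Gamma \vdash V \rrbracket\ } \llbracket \Gamma \rrbracket \xrightarrow{\ \llbracket \Pi \vdash \langle \Gamma \rangle\, M\, \langle \Sigma \rangle \rrbracket (\llbracket \Pi \vdash \Omega \rrbracket)\ } \llbracket \Sigma \rrbracket \right).$$

For brevity, we simply write ⟦(M | V | Ω | ρ)⟧ or even just ⟦C⟧ to refer to the 
above morphism. 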
5.5 Soundness, Adequacy and Big-step Invariance 


Since our operational semantics allows for branching, soundness is showing that 
the interpretation of configurations is equal to the sum of small-step reducts. 


Theorem 21 (Soundness). For any non-terminal configuration C: 

$$\llbracket C \rrbracket = \sum_{C \rightsquigarrow D} \llbracket D \rrbracket.$$


Proof. By induction on the shape of the term component of C. 


Remark 22. The above sum and all sums that follow are well-defined convex 
sums of NCPSU-maps where the probability weights pᵢ have been encoded in 
the density matrices. 


A natural question to ask is whether ⟦C⟧ is also equal to the (potentially 
infinite) sum of all terminal configurations that C reduces to. In other words, 
is the interpretation of configurations also invariant with respect to big-step 
reduction? This is indeed the case and proving this requires considerable effort. 


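Theorem 23 (Big-step Invariance). For any configuration C, we have: 

$$\llbracket C \rrbracket = \bigvee_{n=0}^{\infty} \sum_{r \in \mathrm{TerSeq}_{\leq n}(C)} \llbracket \mathrm{End}(r) \rrbracket.$$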

The above theorem is the main result of our paper. This is a powerful result, 
because with big-step invariance in place, computational adequacy⁴ at all types is 
now a simple consequence of the causal properties of our interpretation. Observe 
that for any configuration C, we have a subunital map ⋄ ∘ ⟦C⟧ : ℂ → ℂ and 
evaluating it at 1 yields a real number (⋄ ∘ ⟦C⟧)(1) ∈ [0, 1]. 

Theorem 24 (Adequacy). For any normalised C: (⋄ ∘ ⟦C⟧)(1) = Halt(C). 

If C is not normalised, then adequacy can be recovered simply by normalising: 
(⋄ ∘ ⟦C⟧)(1) = tr(C)Halt(C), for any possible configuration C. The adequacy 
formulation of [17] and [5] is now a special case of our more general formulation. 
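
To make the shape of Theorems 23 and 24 concrete, the following sketch computes the partial sums for a toy loop that is not taken from the paper: we assume, purely for illustration, a loop whose every iteration terminates with probability 1/2 (as would happen when repeatedly measuring a freshly prepared qubit in an equal superposition). The function name `halting_mass_up_to` is hypothetical. The n-th approximant is the total weight of terminating runs of length at most n, and the halting probability is the supremum of this increasing chain.

```python
from fractions import Fraction


def halting_mass_up_to(n):
    """Weight of all terminating runs with at most n iterations.

    Toy assumption: a run terminates at iteration k with probability (1/2)^k.
    """
    return sum((Fraction(1, 2) ** k for k in range(1, n + 1)), Fraction(0))


# Increasing chain of approximants: 0, 1/2, 3/4, 7/8, ...
approximants = [halting_mass_up_to(n) for n in range(6)]
```

Each approximant lies in [0, 1], matching the observation that the discarded interpretation evaluates to a real number in the unit interval; the supremum of the chain is 1 in this toy case.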


Corollary 25. Let M be a closed program of unit type, i.e. · ⊢ ⟨·⟩ M ⟨·⟩. Then: 

$$\llbracket (M \mid \cdot \mid \cdot \mid 1) \rrbracket (1) = \mathrm{Halt}(M \mid \cdot \mid \cdot \mid 1).$$

Proof. By Theorem 24 and because ⋄_I = id. 


⁴ Recall that a computational adequacy result has to establish an equivalent purely 
denotational characterisation of the operational notion of non-termination. 
6 Conclusion and Related Work 


There are many quantum programming languages described in the literature. 
For a survey see [7] and [16, pp. 129]. Some circuit programming languages (e.g. 
Proto-Quipper [21,22,15]), generate quantum circuits, but do not necessarily 
support executing quantum measurements. Here we focus on quantum languages 
which support measurement and which have either inductive datatypes or some 
computational adequacy result. 

Our work is the first to present a detailed semantic treatment of user-defined 
inductive datatypes for quantum programming. In [17] and [5], the authors show 
how to interpret a quantum lambda calculus extended with a datatype for lists, 
but their syntax does not support any other inductive datatypes. These 
languages are equipped with lambda abstractions, whereas our language has only 
support for procedures. Lambda abstractions are modelled using constructions 
from quantitative semantics of linear logic in [17] and techniques from game 
semantics in [5]. We believe our model is simpler and certainly more physically 
natural, because we work only with mathematical structures used by physicists 
in their study of quantum mechanics. Both [17] and [5] prove an adequacy 
result for programs of unit type. In [20], the authors discuss potential categorical 
models for inductive datatypes in quantum programming, but there is no de- 
tailed semantic treatment provided and there is no adequacy result, because the 
language lacks recursion. 

Other quantum programming languages without inductive datatypes, but 
which prove computational adequacy results include [9,12]. A model based on 
W*-algebras for a quantum lambda calculus without recursion or inductive 
datatypes was described in a recent manuscript [4]. In that model, it appears 
that currying is not a Scott-continuous operation, and if so, the addition of re- 
cursion renders the model neither sound, nor adequate. For this reason, we use 
procedures and not lambda abstractions in our language. 

To conclude, we presented two novel results in quantum programming: (1) we 
provided a denotational semantics for a quantum programming language with 
inductive datatypes; (2) we proved that our denotational semantics is invariant 
with respect to big-step reduction. We also showed that the latter result is quite 
powerful by demonstrating how it immediately implies computational adequacy. 

Our denotational model is based on W*-algebras, which are used by physicists 
to study quantum foundations. We hope this would make it useful for developing 
static analysis methods (based on abstract interpretation) that can be used for 
entanglement detection [18] and we plan on investigating this in future work. 
Abstract. We present the spinal atomic λ-calculus, a typed λ-calculus 
with explicit sharing and atomic duplication that achieves spinal full 
laziness: duplicating only the direct paths between a binder and bound 
variables is enough for beta reduction to proceed. We show this calculus 
is the result of a Curry-Howard style interpretation of a deep-inference 
proof system, and prove that it has natural properties with respect to 
the λ-calculus: confluence and preservation of strong normalisation. 


Keywords: Lambda-Calculus · Full laziness · Deep inference · Curry-Howard 


1 Introduction 


In the λ-calculus, a main source of efficiency is sharing: multiple use of a single 
subterm, commonly expressed through graph reduction [27] or explicit 
substitution [1]. This work, and the atomic λ-calculus [16] on which it builds, is an 
investigation into sharing as it occurs naturally in intuitionistic deep-inference 
proof theory [26]. The atomic λ-calculus arose as a Curry-Howard 
interpretation of a deep-inference proof system, in particular of the distribution rule given 
below left, a variant of the characteristic medial rule [10,26]. In the term 
calculus, the corresponding distributor enables duplication to proceed atomically, 
on individual constructors, in the style of sharing graphs [21]. As a consequence, 
the natural reduction strategy in the atomic λ-calculus is fully lazy [27,4]: it 
duplicates only the minimal part of a term, the skeleton, that can be obtained 
by lifting out subterms as explicit substitutions. (While duplication is atomic 
locally, a duplicated abstraction does not form a redex until also its bound 
variables have been duplicated; hence duplication becomes fully lazy globally.) 


This work was supported by EPSRC Project EP/R029121/1 Typed Lambda-Calculi 
with Sharing and Unsharing and ANR project 15-CE25-0014 The Fine Structure of 
Formal Proof Systems and their Computational Interpretations (FISP) 


© The Author(s) 2020 
J. Goubault-Larrecq and B. König (Eds.): FOSSACS 2020, LNCS 12077, pp. 582-601, 2020. 
https://doi.org/10.1007/978-3-030-45231-5_30 
$$\text{distribution: } \frac{A \to (B \wedge C)}{(A \to B) \wedge (A \to C)} \qquad\qquad \text{switch: } \frac{(A \to B) \wedge C}{A \to (B \wedge C)}$$


We investigate the computational interpretation of another characteristic 
deep-inference proof rule: the switch rule above right [26].⁵ Our result is the 
spinal atomic λ-calculus, a λ-calculus with a refined form of full laziness, spine 
duplication. In the terminology of [4], this strategy duplicates only the spine of 
an abstraction: the paths to its bound variables in the syntax tree of the term.⁶ 

We illustrate these notions in Figure 1, for the example λx.λy.((λz.z)y)x. 
The scope of the abstraction λx is the entire subterm, λy.((λz.z)y)x (which may 
or may not be taken to include λx itself). Note that with explicit substitution, 
the scope may grow or shrink by lifting explicit substitutions in or out. The 
skeleton is the term λx.λy.(wy)x where the subterm λz.z is lifted out as an 
(explicit) substitution [λz.z/w]. The spine of a term, indicated in the second image, 
cannot naturally be expressed with explicit substitution, though one can get an 
impression with capturing substitutions: it would be λx.λy.wx, with the 
subterm (λz.z)y extracted by a capturing substitution [(λz.z)y/w]. Observe that 
the skeleton can be described as the iterated spine: it is the smallest subgraph 
of the syntax tree closed under taking the spine of each abstraction, i.e. that 
contains the spine of every abstraction it contains. 
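
The two notions can be computed directly on a syntax tree. The following is a minimal sketch under stated assumptions, not the calculus defined in this paper: terms are plain Python tuples (`("var", x)`, `("lam", x, body)`, `("app", f, a)`), subterms that are lifted out are shown as a hole `_`, and all function names are hypothetical. A node stays in the spine of λx when its subterm contains a free occurrence of x; for the skeleton, the tracked set also grows with every abstraction that is kept, which implements the "iterated spine" reading.

```python
HOLE = ("hole",)


def free_vars(t):
    """Free variables of a term built from var / lam / app tuples."""
    tag = t[0]
    if tag == "var":
        return {t[1]}
    if tag == "lam":
        return free_vars(t[2]) - {t[1]}
    return free_vars(t[1]) | free_vars(t[2])


def prune(t, tracked, iterate):
    """Replace maximal subterms without tracked free variables by a hole."""
    if not (free_vars(t) & tracked):
        return HOLE
    tag = t[0]
    if tag == "var":
        return t
    if tag == "lam":
        # For the skeleton, binders that are kept become tracked as well.
        inner = tracked | {t[1]} if iterate else tracked
        return ("lam", t[1], prune(t[2], inner, iterate))
    return ("app", prune(t[1], tracked, iterate), prune(t[2], tracked, iterate))


def spine(lam):
    return ("lam", lam[1], prune(lam[2], {lam[1]}, False))


def skeleton(lam):
    return ("lam", lam[1], prune(lam[2], {lam[1]}, True))


def show(t):
    tag = t[0]
    if tag == "hole":
        return "_"
    if tag == "var":
        return t[1]
    if tag == "lam":
        return "\\" + t[1] + "." + show(t[2])
    return "(" + show(t[1]) + " " + show(t[2]) + ")"


# The running example \x.\y.((\z.z) y) x
identity = ("lam", "z", ("var", "z"))
example = ("lam", "x", ("lam", "y",
           ("app", ("app", identity, ("var", "y")), ("var", "x"))))
```

On the running example, the hole in the spine stands for (λz.z)y and the hole in the skeleton stands for λz.z, in line with the capturing and explicit substitutions described above.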

These notions give rise to four natural duplication regimes. For a shared 
abstraction to become available as the function in a β-redex: laziness duplicates 
its scope [22]; full laziness duplicates its skeleton [27]; spinal full laziness 
duplicates its spine [8]; optimal reduction duplicates only the abstraction λx and 
its bound variables x [21,3]. 

While each of these duplication strategies has been expressed in graphs and 
labelled calculi, the atomic λ-calculus is the first term calculus with 
Curry-Howard corresponding proof system to naturally describe full laziness. Likewise, 
the spinal atomic λ-calculus presented here is the first term calculus with 
Curry-Howard corresponding proof system to naturally describe spinal full laziness. 


Switch and Spine. One way to describe the skeleton or the spine of an abstraction 
within a λ-term is through explicit end-of-scope markers, as explored by Berkling 
and Fehr [7], and more recently by Hendriks and Van Oostrom [18]. We use 
their adbmal (⅄) to illustrate the idea: the constructor ⅄x.N indicates that the 
subterm N does not contain occurrences of x (or that any that do occur are 

5 The switch rule is an intuitionistic variant of weak or linear distributivity [12] for 
multiplicative linear logic. 

6 There is a clash of (existing) terminology: the spine of an abstraction, as we use 
here, is a different notion from the spine of a λ-term, which is the path from the 
root to the leftmost variable, as used e.g. in head reduction and abstract machines. 

7 Interestingly, Balabonski [5] shows that for weak reduction (where one does not 
reduce under an abstraction) full laziness and spinal full laziness are both optimal 
(in the number of beta-steps required to reach a normal form). 
Fig. 1: Balanced and unbalanced typing derivations for λx.λy.((λz.z)y)x, with 
corresponding graphical representations of the term. The variable x has type A 
and y, z type A → B, shortened to B^A. The left derivation isolates the skeleton 
of λx, and the right derivation its spine, both by the subderivations in braces. 


not available to a binder λx outside ⅄x.N). The scope of an abstraction thus 
becomes explicitly indicated in the term. This opens up a distinction between 
balanced and unbalanced scopes: whether scopes must be properly nested, or 
not; for example, in λx.λy.N, a subterm ⅄y.⅄x.M is balanced, but ⅄x.⅄y.M is 
not. With balanced scope, one can indicate the skeleton of an abstraction; with 
unbalanced scope (which Hendriks and Van Oostrom dismiss) one can indicate 
the spine. We do so for our example term λx.λy.((λz.z)y)x below. 


Balanced scope/skeleton: λx.λy.(⅄y.(⅄x.λz.z)y)(⅄y.x) 


Unbalanced scope/spine: λx.λy.(⅄x.(⅄y.λz.z)y)(⅄y.x) 


A closely related approach is director strings, introduced by Kennaway and 
Sleep [19] for combinator reduction and generalized to any reduction strategy by 
Fernández, Mackie, and Sinot in [13]. The idea is to use nameless abstractions 
identified by their nesting (as with De Bruijn indices), and make the paths to 
bound variables explicit by annotating each constructor with a string of directors, 
that outline the paths. The primary aim of these approaches is to eliminate 
α-conversion and to streamline substitution. Consequently, while they can identify 
the spine, they do not readily isolate it for duplication. 

The present work starts from our observation that the switch rule of open 
deduction functions as a proof-theoretic end-of-scope construction (see [25] for 
details). However, it does so in a structural way: it forces a deconstruction of 
a proof into readily duplicable parts, which together may form the spine of 
an abstraction. The derivations in Figure 1 demonstrate this, as we will now 
explain—see the next section for how they are formally constructed. 

The abstraction λx corresponds in the proof system to the implication A →, 
explicitly scoping over its right-hand side. On the left, with the abstraction rule 
(λ), scopes must be balanced, and the proof system may identify the skeleton; 
here, that of λx as the largest blue box. Decomposing the abstraction (λ) into 
axiom (a) and switch (s), on the right the proof system may express unbalanced 
scope. It does so by separating the scope of an abstraction into multiple parts; 
here, that of λx is captured as the two top-level red boxes. Each box is ready to 
be duplicated; in this way, one may duplicate the spine of an abstraction only. 

These two derivations correspond to terms in our calculus. The subterms 
not part of the skeleton (i.e. λz.z) remain shared and we are able to duplicate 
the skeleton alone. This is also possible in [16]. In our calculus we are also able 
to duplicate just the spine by using a distributor. We require this construct as 
otherwise we break the binding of the y-abstraction. The distributor manages 
and maintains these bindings. The y-abstraction in the spine (y⟨a⟩) is a 
phantom-abstraction, because it is not real and we cannot perform β-reduction on it. 
However, it may become real during reduction. It can be seen as a placeholder 
for the abstraction. The variables in the cover ⟨a⟩ represent subterms that both 
remain shared and are found in the distributor. 


Skeleton: λx.λy.(a y) x [a ← λz.z] 
Spine: λx.y⟨a⟩.(a) x [y⟨a⟩ | λy.[a ← (λz.z)y]] 


Our investigation is then focused on the interaction of switch and distribution 
(later observed in the rewrite rule l5). The use of the distribution rule allows us 
to perform duplication atomically, and thus provides a natural strategy for spinal 
full laziness. In Figure 1 on the right, this means duplicating the two top-level 
red boxes can be done independently from duplicating the yellow box. 


2 Typing a λ-calculus in open deduction 


We work in open deduction [15], a formalism of deep-inference proof theory, using 
the following proof system for (conjunction-implication) intuitionistic logic. A 
derivation from a premise formula X to a conclusion formula Z is constructed 
inductively as in Figure 2a, with from left to right: a propositional atom a, 
where X = Z = a; horizontal composition with a connective →, where X = 
Y → X₂ and Z = Y → Z₂; horizontal composition with a connective ∧, where 
X = X₁ ∧ X₂ and Z = Z₁ ∧ Z₂; and rule composition, where r is an inference 
rule (Figure 2b) from Y₁ to Y₂. The boxes serve as parentheses (since derivations 
extend in two dimensions) and may be omitted. Derivations are considered up to 
associativity of rule composition. One may consider formulas as derivations that 
omit rule composition. We work modulo associativity, symmetry, and unitality 
of conjunction, justifying the n-ary contraction, and may omit ⊤ from the axiom 
rule. A 0-ary contraction, with conclusion ⊤, is a weakening. Figure 2b: the 
abstraction rule (λ) is derived from axiom and switch. Vertical composition of 
a derivation from X to Y and one from Y to Z, depicted by a dashed line, is a 
defined operation, given in Figure 2c, where ∗ ∈ {∧, →}. 


2.1 The Sharing Calculus 


Our starting point is the sharing calculus (Λ^S), a calculus with an explicit sharing 
construct, similar to explicit substitution. 


[Figure 2 panels: (a) Derivations; (b) Inference rules: axiom (a), application (@), 
contraction (△), switch (s), abstraction (λ); (c) Vertical composition] 


Fig. 2: Intuitionistic proof system in open deduction 


Definition 1. The pre-terms r, s, t, u and sharings [Γ] of the Λ^S are defined 
by: 

s, t ::= x | λx.t | s t | t[Γ]        [Γ] ::= [x₁, ..., xₙ ← s] 


with from left to right: a variable; an abstraction, where x occurs free in t and 
becomes bound; an application, where s and t use distinct variable names; and 
a closure; in t[x⃗ ← s] the variables in the vector x⃗ = x₁, ..., xₙ all occur in t and 
become bound, and s and t use distinct variable names. Terms are pre-terms 
modulo permutation equivalence (∼): 


t[x⃗ ← s][y⃗ ← r] ∼ t[y⃗ ← r][x⃗ ← s]        ({y⃗} ∩ (s)_fv = {}) 


A term is in sharing normal form if all sharings occur as [x⃗ ← x] either at 
the top level or directly under a binding abstraction, as λx.t[x⃗ ← x]. 


Note that variables are linear: variables occur at most once, and bound variables 
must occur. A vector x⃗ has length |x⃗| and consists of the variables x₁, ..., x_{|x⃗|}. 


An environment is a sequence of sharings [Γ⃗] = [Γ₁]...[Γₙ]. Substitution is 
written {t/x}, and {t₁/x₁}...{tₙ/xₙ} may be abbreviated to {tᵢ/xᵢ}_{i∈[n]}. 


Definition 2. The interpretation ⟦−⟧ : Λ^S → Λ is defined below. 

⟦x⟧ = x    ⟦λx.t⟧ = λx.⟦t⟧    ⟦s t⟧ = ⟦s⟧⟦t⟧    ⟦t[x₁, ..., xₙ ← s]⟧ = ⟦t⟧{⟦s⟧/xᵢ}_{i∈[n]} 
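
The interpretation of Definition 2 can be sketched as a small recursive function. This is an illustration under stated assumptions rather than the authors' implementation: terms are Python tuples, a closure t[x₁, ..., xₙ ← s] is encoded as the hypothetical node `("share", t, [x1, ..., xn], s)`, and substitution is naive, which is only safe because variables are assumed linear and all names distinct.

```python
def subst(t, name, replacement):
    """Naive substitution on plain lambda-terms; assumes no name capture."""
    tag = t[0]
    if tag == "var":
        return replacement if t[1] == name else t
    if tag == "lam":
        return ("lam", t[1], subst(t[2], name, replacement))
    return ("app", subst(t[1], name, replacement), subst(t[2], name, replacement))


def readback(t):
    """Unfold every sharing into ordinary substitutions."""
    tag = t[0]
    if tag == "var":
        return t
    if tag == "lam":
        return ("lam", t[1], readback(t[2]))
    if tag == "app":
        return ("app", readback(t[1]), readback(t[2]))
    # ("share", body, names, shared) encodes body[names <- shared]
    body, names, shared = readback(t[1]), t[2], readback(t[3])
    for name in names:
        body = subst(body, name, shared)
    return body


# \x.(x1 x2)[x1, x2 <- x] is in sharing normal form and reads back to \x.x x
shared_term = ("lam", "x",
               ("share", ("app", ("var", "x1"), ("var", "x2")),
                ["x1", "x2"], ("var", "x")))
plain_term = readback(shared_term)
```

An empty list of names models a weakening: the shared term is simply dropped by the readback.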


The translation ⦇N⦈ of a λ-term N is the unique sharing-normal term t 
such that N = ⟦t⟧. A term t will be typed by a derivation with restricted types, 
as shown below, where the context type Γ = A₁ ∧ ··· ∧ Aₙ will have an Aᵢ for each 
free variable xᵢ of t. We connect free variables to their premises by writing A^x 
and Γ^{x⃗}. The Λ^S is then typed as in Figure 3. 


Fig. 3: Typing system for Λ^S 


3 The Spinal Atomic λ-Calculus 


We now formally introduce the syntax of the spinal atomic λ-calculus (Λ^S_a), by 
extending the definition of the sharing calculus in Definition 1 with a distributor 
construct that allows for atomic duplication of terms. 


Definition 3 (Pre-Terms). The pre-terms r, s, t, closures [Γ], and 
environments [Γ⃗] of the Λ^S_a are defined by: 

s, t ::= x | s t | x⟨y⃗⟩.t | t[Γ]        [Γ⃗] ::= [Γ] | [Γ][Γ⃗] 
[Γ] ::= [x⃗ ← t] | [x⃗ | y⟨z⃗⟩ [Γ⃗]] 


Our generalized abstraction x⟨y⃗⟩.t is a phantom-abstraction, where x is a 
phantom-variable and the cover y⃗ will be a subset of the free variables of 
t. It can be thought of as a “delayed” abstraction: x is a binder, but possibly 
not in t itself, and instead in the terms substituted for the variables y⃗; in other 
words, x is a capturing binder for substitution into y⃗. We define standard 
λ-abstraction as the special case λx.t = x⟨x⟩.t, and generally, when we refer to 
x⟨y⃗⟩ as a phantom-abstraction (rather than an abstraction) we assume y⃗ ≠ x. 


The distributor u[x⃗ | y⟨z⃗⟩[Γ⃗]] binds the phantom-variables x⃗ in u, while its 
environment [Γ⃗] will bind the variables in their covers; intuitively, it represents a 
set of explicit substitutions in which the variables x⃗ are expected to be captured. 

The distributor is introduced when we wish to duplicate an abstraction, as 
depicted in Figure 4a. The sharing node (◦) duplicates the abstraction node, 
creating a distributor (depicted as the sharing and unsharing node (•)), together 
with the bindings of the phantom-variables (depicted with a dashed line). The 
variables captured by the environment are the variables connected to sharing 
nodes linked with a dotted line. Notice one sharing node can be linked with 
multiple unsharing nodes, and vice versa. Duplication of applications also duplicates 
(e) Duplicating the spine 


Fig. 4: Graphical illustration of the distributor 


the dotted line (Figure 4b), but these can be removed later if the term does not 
contain the variable bound to the unsharing (Figure 4c). These subterms are 
those which are not part of the spine. Eventually, we will reach a state where 
the only sharing node connected to the unsharing node is the one that shared 
the variable bound to the unsharing, allowing us to eliminate the distributor 
(Figure 4d). The purpose of the dotted line is similar to the brackets of optimal 
reduction graphs [21, 24], to supervise which sharing and unsharing match. 

Terms are then pre-terms with sensible and correct bindings. To define terms, 
we first define free and bound variables and phantom variables; variables are 
bound by abstractions (not phantoms) and by sharings, while phantom-variables 
are bound by distributors. 


Definition 4 (Free and Bound Variables). The free variables (−)_fv and 
bound variables (−)_bv of a pre-term t are defined as follows 


(x)_fv = {x}                                    (x)_bv = {} 
(s t)_fv = (s)_fv ∪ (t)_fv                      (s t)_bv = (s)_bv ∪ (t)_bv 
(x⟨x⟩.t)_fv = (t)_fv − {x}                      (x⟨x⟩.t)_bv = (t)_bv ∪ {x} 
(x⟨y⃗⟩.t)_fv = (t)_fv                            (x⟨y⃗⟩.t)_bv = (t)_bv 
(u[x⃗ ← t])_fv = (u)_fv ∪ (t)_fv − {x⃗}           (u[x⃗ ← t])_bv = (u)_bv ∪ (t)_bv ∪ {x⃗} 
(u[x⃗ | y⟨y⟩[Γ⃗]])_fv = (u[Γ⃗])_fv − {y}           (u[x⃗ | y⟨y⟩[Γ⃗]])_bv = (u[Γ⃗])_bv ∪ {y} 
(u[x⃗ | y⟨z⃗⟩[Γ⃗]])_fv = (u[Γ⃗])_fv                (u[x⃗ | y⟨z⃗⟩[Γ⃗]])_bv = (u[Γ⃗])_bv 


Definition 5 (Free and Bound Phantom-Variables). The free 
phantom-variables (−)_fp and bound phantom-variables (−)_bp of the pre-term t are 
defined as follows 


(x)_fp = {}                                     (x)_bp = {} 
(s t)_fp = (s)_fp ∪ (t)_fp                      (s t)_bp = (s)_bp ∪ (t)_bp 
(x⟨x⟩.t)_fp = (t)_fp 
(c⟨x⃗⟩.t)_fp = (t)_fp ∪ {c}                      (c⟨x⃗⟩.t)_bp = (t)_bp 
(u[x⃗ ← t])_fp = (u)_fp ∪ (t)_fp                 (u[x⃗ ← t])_bp = (u)_bp ∪ (t)_bp 
(u[x⃗ | c⟨c⟩[Γ⃗]])_fp = (u[Γ⃗])_fp − {x⃗} 
(u[x⃗ | c⟨y⃗⟩[Γ⃗]])_fp = (u[Γ⃗])_fp ∪ {c} − {x⃗}    (u[x⃗ | c⟨y⃗⟩[Γ⃗]])_bp = (u[Γ⃗])_bp ∪ {x⃗} 


The free covers (u)_fc and bound covers (u)_bc are the covers associated with 
the free phantom-variables (u)_fp respectively the bound phantom-variables (u)_bp 
of u; that is, if x occurs as x⟨a⃗⟩ in u and x ∈ (u)_fp then ⟨a⃗⟩ ∈ (u)_fc. When 
bound, x and the variables in a⃗ may be alpha-converted independently. When 
a distributor u[x⃗ | y⟨z⃗⟩[Γ⃗]] binds the phantom-variables x⃗ = x₁, ..., xₙ, where 
each xᵢ occurs as xᵢ⟨a⃗ᵢ⟩ in u, then for technical convenience we may make the 
covers explicit in the distributor itself, and write 


u[x₁⟨a⃗₁⟩ ... xₙ⟨a⃗ₙ⟩ | y⟨z⃗⟩[Γ⃗]]. 


The environment [Γ⃗] is expected to bind exactly the variables in the covers ⟨a⃗ᵢ⟩. 
We apply this and other restrictions to define the terms of the calculus. 


Definition 6. Terms t ∈ Λ^S_a are pre-terms with the following constraints 


1. Each variable may occur at most once. 

2. In a phantom-abstraction x⟨y⃗⟩.t, {y⃗} ⊆ (t)_fv. 

3. In a sharing u[x⃗ ← t], {x⃗} ⊆ (u)_fv. 

4. In a distributor u[x₁⟨a⃗₁⟩ ... xₙ⟨a⃗ₙ⟩ | y⟨z⃗⟩[Γ⃗]] 
   (a) {x₁, ..., xₙ} ⊆ (u)_fp; 
   (b) the variables in ⋃_{i∈[n]} {a⃗ᵢ} are free in u and bound by [Γ⃗]; 
   (c) the variables in {z⃗} occur freely in the environment [Γ⃗]. 


Example 1. Here we show some pre-terms that are not terms. 


- c⟨x⟩.y (violates condition 2) 

- x y[z, z ← w] (violates condition 3) 

— €9( wə ).We ((e1( w1 ).w1) 2) Le1( w1 ), e2( w2 ) lez) [w1, w2 — a(x).xy]] 
(violates condition 4a) 


We also work modulo permutation with respect to the variables in the cover 
of phantom-abstractions. Let x⃗ be a list of variables and let x⃗_P be a permutation 
of that list, then the following terms are considered equal. 


u[x⃗ ← t] ∼ u[x⃗_P ← t]        y⟨x⃗⟩.t ∼ y⟨x⃗_P⟩.t 


Fig. 5: Typing derivations for phantom-abstractions and distributors 


Terms are typed with the typing system for Λ^S extended with the distribution 
inference rule. This rule is the result of computationally interpreting the medial 
rule as done in [16]. We obtain this variant of the medial rule due to the 
restriction for implications and to avoid introducing disjunction to the typing system. 
The terms of Λ^S_a are then typed as in both Figure 3 and Figure 5. Note that 
environments are typed by the derivations of all their closures composed horizontally with 
the conjunction connective. Also note that the case for a phantom-abstraction is 
similar to that of an abstraction, where we replace one occurrence of the simple 
type A by the conjunction Γ. 


3.1 Compilation and Readback. 


We now define the translations between Λ^S_a and the original λ-calculus. First 
we define the interpretation Λ → Λ^S_a (compilation). Intuitively, it replaces each 
abstraction λx.− with the term x⟨x⟩.−[x₁, ..., xₙ ← x] where x₁, ..., xₙ replace 
the occurrences of x. Actual substitutions are denoted as {t/x}. Let |M|_x denote 
the number of occurrences of x in M, and if |M|_x = n let $M\tfrac{n}{x}$ denote M with 
the occurrences of x replaced by fresh, distinct variables x₁, ..., xₙ. First, the 
translation of a closed term M is ⦇M⦈′, defined below 


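Definition 7 (Compilation). The interpretation of λ terms, ⦇−⦈ : Λ → Λ^S_a, 
is defined as 

$$⦇M⦈ = ⦇M\tfrac{n_1}{x_1} \cdots \tfrac{n_k}{x_k}⦈'\,[x_1^1, \dots, x_1^{n_1} \leftarrow x_1] \cdots [x_k^1, \dots, x_k^{n_k} \leftarrow x_k]$$

where x₁, ..., x_k are the free variables of M such that |M|_{xᵢ} = nᵢ > 1 and 
⦇−⦈′ is defined on terms as (where n ≠ 1 in the abstraction case): 

\begin{align*}
⦇x⦈' &= x \\
⦇M N⦈' &= ⦇M⦈'\,⦇N⦈' \\
⦇\lambda x.M⦈' &= \begin{cases} x\langle x \rangle.⦇M⦈' & \text{if } |M|_x = 1 \\ x\langle x \rangle.⦇M\tfrac{n}{x}⦈'\,[x_1, \dots, x_n \leftarrow x] & \text{if } |M|_x = n \end{cases}
\end{align*}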

The readback into the λ-calculus is slightly more complicated, specifically 
due to the bindings induced by the distributor. Interpreting a distributor 
construct as a λ-term requires (1) converting the phantom-abstractions it binds in 
u into abstractions (2) collapsing the environment (3) maintaining the bindings 
between the converted abstractions and the intended variables located in the 
environment. 


Definition 8. Given a total function σ with domain D and codomain C, we
overwrite the function with case x ↦ v where x ∈ D and v ∈ C such that


  σ[x ↦ v](z) := if (x = z) then v else σ(z)
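
As a minimal sketch of Definition 8 (an illustration only, not part of the original development), the overwrite operation can be modelled by wrapping a function. The names `overwrite` and `identity` are assumptions introduced here; variables and terms are represented as plain strings.

```python
def overwrite(sigma, x, v):
    """Return the map sigma[x -> v]: it sends x to v and agrees with sigma elsewhere."""
    def updated(z):
        return v if z == x else sigma(z)
    return updated


def identity(z):
    """The identity map, used here as the starting point."""
    return z


# Overwrite the identity map at the variable "x" with a placeholder term "t".
sigma = overwrite(identity, "x", "t")
at_x = sigma("x")
at_y = sigma("y")
```

Later overwrites shadow earlier ones for the same variable, which matches the if-then-else reading of the definition.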


We use the map σ as part of the translation. The intuition is that for all
bound variables x in the term we are translating, it should be that σ(x) = x.
The purpose of the map γ is to keep track of the binding of phantom-variables.

Definition 9. The interpretation ⟦− | − | −⟧ : Λ_a^S × (V → Λ) × (V → V) → Λ is
defined as

  ⟦x | σ | γ⟧ = σ(x)
  ⟦s t | σ | γ⟧ = ⟦s | σ | γ⟧ ⟦t | σ | γ⟧
  ⟦c⟨c⟩.t | σ | γ⟧ = λc.⟦t | σ[c ↦ c] | γ⟧
  ⟦c⟨x_1, …, x_n⟩.t | σ | γ⟧ = λc.⟦t | σ[x_i ↦ σ(x_i){c/γ(c)}]_{i∈[n]} | γ⟧
  ⟦u[x_1, …, x_n ← t] | σ | γ⟧ = ⟦u | σ[x_i ↦ ⟦t | σ | γ⟧]_{i∈[n]} | γ⟧
  ⟦u[e_1⟨w⃗_1⟩, …, e_n⟨w⃗_n⟩ | c⟨c⟩ [Γ]] | σ | γ⟧ = ⟦u[Γ] | σ | γ[e_i ↦ c]_{i∈[n]}⟧
  ⟦u[e_1⟨w⃗_1⟩, …, e_n⟨w⃗_n⟩ | c⟨x_1, …, x_m⟩ [Γ]] | σ | γ⟧ = ⟦u[Γ] | σ′ | γ[e_i ↦ c]_{i∈[n]}⟧
      where σ′ = σ[x_i ↦ σ(x_i){c/γ(c)}]_{i∈[m]}


The following Proposition justifies working modulo permutation equivalence. 


Proposition 1. For s, t ∈ Λ_a^S, if s ∼ t then ⟦s⟧ = ⟦t⟧.


3.2 Rewrite Rules. 


Both the spinal atomic λ-calculus and the atomic λ-calculus of [16] follow atomic
reduction steps, i.e. they apply on individual constructors. The biggest difference
is that our calculus is capable of duplicating not only the skeleton but also
the spine. The rewrite rules in our calculus make use of three operations: substitution,
book-keeping, and exorcism. The operation substitution t{s/x} propagates
through the term t, and replaces the free occurrences of the variable x with the
term s. Moreover, if x occurs in the cover of a phantom-variable e⟨y⃗·x⟩, then
substitution replaces the x in the cover with (s)_fv, resulting in e⟨y⃗·(s)_fv⟩.
Although substitution performs some book-keeping on phantom-abstractions, we
define an explicit notion of book-keeping {y⃗/e}_b, that updates the variables
stored in a free cover, i.e. for a term t, if e⟨x⃗⟩ ∈ (t)_fc then e⟨y⃗⟩ ∈ (t{y⃗/e}_b)_fc. The
last operation we introduce is called exorcism {c⟨x⃗⟩}_e. We perform exorcisms
on phantom-abstractions to convert them to abstractions. Intuitively, this will
be performed on phantom-abstractions with phantom-variables bound to a distributor
when said distributor is eliminated. It converts phantom-abstractions
to abstractions by introducing a sharing of the phantom-variable that captures
the variables in the cover, i.e. (c⟨x⃗⟩.t){c⟨x⃗⟩}_e = c⟨c⟩.t[x⃗ ← c].

Proposition 2. The translation ⟦u | σ | γ⟧ commutes with substitutions,
book-keepings (1), and exorcisms (2) in the following way:

  ⟦u{t/x} | σ | γ⟧ = ⟦u | σ[x ↦ ⟦t | σ | γ⟧] | γ⟧
  ⟦u{x⃗/c}_b | σ | γ⟧ = ⟦u | σ | γ⟧
  ⟦u{c⟨x_1, …, x_n⟩}_e | σ | γ⟧ = ⟦u | σ[x_i ↦ c]_{i∈[n]} | γ⟧

(1) Given c⟨y⃗⟩ ∈ (u)_fc where x⃗ ⊆ y⃗ and for z ∈ y⃗ \ x⃗, γ(c) ∉ (σ(z))_fv
(2) Given c⟨x⃗⟩ ∈ (u)_fc or {x⃗} ∩ (u)_fv = {}

Proof. See [25], proofs of Propositions 18, 19, 20, 21.


Using these operations, we define the rewrite rules that allow for spinal
duplication. Firstly we have beta reduction (⇝_β), which strictly requires an
abstraction (not a phantom).


  (x⟨x⟩.t) s ⇝_β t{s/x}


Here β-reduction is a linear operation, since the bound variable x occurs exactly
once in the body t. Any duplication of the term t in the atomic λ-calculus
proceeds via the sharing reductions.

The first set of sharing reduction rules moves closures towards the outside of
a term. Most of these rewrite rules only change the typing derivations in the way
that subderivations are composed, with the exception of moving a closure out
of scope of a distributor.


  s[Γ] t ⇝_L (s t)[Γ]                                      (l1)
  s t[Γ] ⇝_L (s t)[Γ]                                      (l2)
  d⟨x⃗⟩.t[Γ] ⇝_L (d⟨x⃗⟩.t)[Γ]   if {x⃗} ∩ (t)_fv = {x⃗}      (l3)
  u[x⃗ ← t[Γ]] ⇝_L u[x⃗ ← t][Γ]                             (l4)


For the case of lifting a closure outside a distributor, we use a notation ‖[Γ]‖
to identify the variables captured by a closure, i.e. ‖[x⃗ ← t]‖ = {x⃗} and
‖[e_1⟨w⃗_1⟩, …, e_n⟨w⃗_n⟩ | c⟨c⟩ [Γ]]‖ = {w⃗_1, …, w⃗_n}. Then let {z⃗} = ‖[Γ′]‖ in the
following rewrite rule, where we remove z⃗ from the covers, that can only occur
if {x⃗} ∩ ([Γ′])_fv = {}.


  u[e_1⟨w⃗_1⟩ … e_n⟨w⃗_n⟩ | c⟨x⃗⟩ [Γ][Γ′]]
    ⇝_L u{(w⃗_i \ z⃗)/e_i}_b, i∈[n] [e_1⟨w⃗_1 \ z⃗⟩ … e_n⟨w⃗_n \ z⃗⟩ | c⟨x⃗⟩ [Γ]][Γ′]     (l5)


The graphical version of this rule is shown in Figure 4c, where we remove
the edge only if there is no edge between ¢ and the unsharing node. The proof
rewrite rule corresponding with the rewrite rule l5 can be broken down into two
parts. The first part is readjusting how the derivations compose as shown below.


[Derivation not reproduced: first part of the proof rewrite for l5, readjusting how the derivations compose.]

The second part of the rewrite rule justifies the need for the book-keeping
operation. In the rewrite below, let A be the type of a variable z where z ∈ z⃗.
After lifting, we want to remove the variable from the cover so as to ensure
correctness, since the variables in the cover denote the variables captured by the
environment. Book-keeping allows us to remove these variables simultaneously.


[Derivation not reproduced: second part of the proof rewrite for l5, removing the lifted variables from the cover.]


The lifting rules (l_i) are justified by the need to lift closures out of the
distributor, as opposed to duplicating them. In the second set of rewrite rules,
consecutive sharings are compounded and unary sharings are applied as substitutions.
For simplicity, in the equivalent proof rewrite step we only show the binary case.


  u[w⃗ ← y][y·y⃗ ← t] ⇝_C u[w⃗·y⃗ ← t]     (c1)
  u[x ← t] ⇝_C u{t/x}                     (c2)

[Derivations not reproduced: binary-case proof rewrite steps for c1 and c2.]


The atomic steps for duplicating are given in the third and final set of rewrite
rules. The first is the atomic duplication step of an application, which is the
same rule used in [16]. The binary case proof rewrite steps for each rule are also
provided. These are also shown graphically in (respectively) Figure 4b (where
we maintain links between sharings and unsharings), Figure 4a, and Figure 4d
(where the unsharing node is linked to exactly one connecting sharing node).


  u[x_1 … x_n ← s t] ⇝_D u{z_1 y_1/x_1} … {z_n y_n/x_n}[z_1 … z_n ← s][y_1 … y_n ← t]     (d1)

  u[x_1, …, x_n ← c⟨y⃗⟩.t]
    ⇝_D u{e_i⟨w_i⟩.w_i/x_i}_{i∈[n]} [e_1⟨w_1⟩ … e_n⟨w_n⟩ | c⟨y⃗⟩ [w_1, …, w_n ← t]]          (d2)

  u[e_1⟨w⃗_1⟩ … e_n⟨w⃗_n⟩ | c⟨c⟩ [w⃗_1, …, w⃗_n ← c]] ⇝_D u{e_1⟨w⃗_1⟩}_e … {e_n⟨w⃗_n⟩}_e      (d3)

[Derivations not reproduced: binary-case proof rewrite steps for d1, d2, and d3.]

Example 2. The following example, illustrated in Figure 4e, is a reduction in the
term calculus where we duplicate the spine of the term [a_1, a_2 ← λx.λy.((λz.z)y)x].


  ⇝_D {x_1⟨b_1⟩.b_1/a_1}{x_2⟨b_2⟩.b_2/a_2}
      [x_1⟨b_1⟩, x_2⟨b_2⟩ | x⟨x⟩ [b_1, b_2 ← λy.((λz.z)y)x]]

  ⇝_D {x_1⟨c_1⟩.y_1⟨c_1⟩.c_1/a_1}{x_2⟨c_2⟩.y_2⟨c_2⟩.c_2/a_2}
      [x_1⟨c_1⟩, x_2⟨c_2⟩ | x⟨x⟩ [y_1⟨c_1⟩, y_2⟨c_2⟩ | y⟨y⟩ [c_1, c_2 ← ((λz.z)y)x]]]

  ⇝_D {x_1⟨d_1, e_1⟩.y_1⟨d_1, e_1⟩.d_1 e_1/a_1}{x_2⟨d_2, e_2⟩.y_2⟨d_2, e_2⟩.d_2 e_2/a_2}
      [x_1⟨d_1, e_1⟩, x_2⟨d_2, e_2⟩ | x⟨x⟩ [y_1⟨d_1, e_1⟩, y_2⟨d_2, e_2⟩ | y⟨y⟩ [d_1, d_2 ← (λz.z)y][e_1, e_2 ← x]]]

  ⇝_L {x_1⟨d_1, e_1⟩.y_1⟨d_1⟩.d_1 e_1/a_1}{x_2⟨d_2, e_2⟩.y_2⟨d_2⟩.d_2 e_2/a_2}
      [x_1⟨d_1, e_1⟩, x_2⟨d_2, e_2⟩ | x⟨x⟩ [y_1⟨d_1⟩, y_2⟨d_2⟩ | y⟨y⟩ [d_1, d_2 ← (λz.z)y]][e_1, e_2 ← x]]

  ⇝_L {x_1⟨e_1⟩.y_1⟨d_1⟩.d_1 e_1/a_1}{x_2⟨e_2⟩.y_2⟨d_2⟩.d_2 e_2/a_2}
      [x_1⟨e_1⟩, x_2⟨e_2⟩ | x⟨x⟩ [e_1, e_2 ← x]] [y_1⟨d_1⟩, y_2⟨d_2⟩ | y⟨y⟩ [d_1, d_2 ← (λz.z)y]]

  ⇝_D {λx_1.y_1⟨d_1⟩.d_1 x_1/a_1}{λx_2.y_2⟨d_2⟩.d_2 x_2/a_2}
      [y_1⟨d_1⟩, y_2⟨d_2⟩ | y⟨y⟩ [d_1, d_2 ← (λz.z)y]]

Reduction (⇝_{(L,C,D,β)}) preserves the conclusion of the derivation, and thus the
following proposition is easy to observe.


Proposition 3. If s ⇝_{(L,C,D,β)} t and s : A, then t : A.


Definition 10. For a term t ∈ Λ_a^S, if there does not exist a term s ∈ Λ_a^S such
that t ⇝_{(L,C,D)} s, then it is said that t is in sharing normal form.


The following Lemma not only proves we have good translations in Section 3.1,
but also shows that duplication preserves denotation.

Lemma 1. For a t ∈ Λ_a^S in sharing normal form and an N ∈ Λ,

  ⟦⦇N⦈⟧ = N        ⦇⟦t⟧⦈ = t        ∃M ∈ Λ. t = ⦇M⦈

Otherwise, if s ⇝_{(L,C,D)} t then ⟦s | σ | γ⟧ = ⟦t | σ | γ⟧.

Proof. See [25, Lemma 24, Lemma 25].

Lemma 2. Given a term t ∈ Λ_a^S, then ⦇⟦t⟧⦈ is t in sharing normal form.


Proof. We can prove this by induction on the longest sharing reduction path
from t. Our base case is already covered by Lemma 1. We are then interested in
the inductive case, where t is not in sharing normal form. By Lemma 1, ⟦t⟧ = ⟦t′⟧
where t ⇝_{(D,L,C)} t′. By the induction hypothesis, ⦇⟦t′⟧⦈ is in sharing normal form.
Hence ⦇⟦t⟧⦈ is in sharing normal form. □


4 Strong Normalisation of Sharing Reductions 


In order to show our calculus is strongly normalising, we first show that the
sharing reduction rules are strongly normalising. We define a measure on terms
and show that this measure strictly decreases as sharing reduction progresses.
Similar ideas and results can be found elsewhere: with memory in [20], the
λI-calculus in [6], the λ-void calculus [2], and the weakening λμ-calculus [17].
Our measure will consist of three components. First, the height of a term is a
multiset of integers that measures the number of constructors from each sharing
node to the root of the term in its graphical notation. The height is defined on
terms as H^i(−), where i is an integer. We say H(t) for H^1(t). We use ⊔ to
denote the disjoint union of two multisets. We denote H^i([Γ_1]) ⊔ ⋯ ⊔ H^i([Γ_n])
as H^i([Γ]) for the environment [Γ] = [Γ_1], …, [Γ_n].


Definition 11 (Sharing Height). The sharing height H^i(t) of a term t is
given below, where n is the number of closures in [Γ]:


  H^i(x) = {}                              H^i(s t) = H^{i+1}(s) ⊔ H^{i+1}(t)
  H^i(c⟨x⃗⟩.t) = H^{i+1}(t)                 H^i(t[Γ]) = H^i(t) ⊔ H^i([Γ]) ⊔ {i^n}
  H^i([x_1, …, x_n ← t]) = H^{i+1}(t)      H^i([e⃗ | c⟨x⃗⟩ [Γ]]) = H^{i+1}([Γ]) ⊔ {(i+1)^n}


This measure then strictly decreases for the rewrite rules l1, l2, l3, l4 and l5, i.e.
if t ⇝_L u then H^i(t) > H^i(u). The second measure we consider is the weight of a
term. Intuitively this quantifies the remaining duplications, which are performed
with ⇝_D reductions. If a term would be deleted, we assign it a weight ‘1’
to express that it is not duplicated. Calculating the weight requires an auxiliary
function that assigns integer weights to the variables of a term. This function is
defined on terms as V^i(−), where i is an integer. Measuring variables independently
of binders is vital. It allows us to measure distributors, which duplicate λ’s but not
the bound variable. Also, only bound variables for abstractions are measured,
since variables bound by sharings are substituted in the interpretation.




Definition 12 (Variable Weights). The function V^i(t) returns a function
that assigns integer weights to the free variables of t. It is defined below,
where f = V^i(t) and g = f(x_1) + ⋯ + f(x_n) for each x_i ∈ x⃗.


Pe- een Vi(st) = vi(s) eV 
Vi(ele).t) = VOH Vi(ol#).2) = V(t) u {c> i} 
viž- s= VOKE uV (s) viae- s) =V) eV\(s) 


V' (tle: (w1)...en( ön) lele) [P]]) = V(t ])/{e,e1,..-,€n} 


V' (tle w1)... en( Wn) eZ) [T] =V (tL ])/{er,---,en}u {er i} 


The weight of a term can then be defined via the use of this auxiliary function.
The auxiliary function is used when calculating the weight of a sharing, where
the sharing weight of the variables bound by the sharing plays a significant role
in calculating the weight of the shared term. In the case of a weakening [← t],
we assign an initial weight of 1. Again we say W(t) = W^1(t).


Definition 13 (Sharing Weight). The sharing weight W^i(t) of a term t is a
multiset of integers computed by the function defined below, where f = V^i(t) and
g = f(x_1) + ⋯ + f(x_n) for each x_i ∈ x⃗.


Wi (x) = {} W’ (st) =W'(s) u W(t) u {i} 
W'(c(c).t) = Wi(t) u {i} u {i )} W' (el &).t) =W (t) u {i} 
W’ (t[Z < s]) = W'(t) u W9(s) Wi (t[< s]) = W'(t) u W! (s8) 


W'(tLer( tir) .--€n( Wn )|e(e) [T]]) = WED o VEIO 


Wi (tLer( tr)... €n( tn) lež) = wE 


This measure then strictly decreases on the rewrite rules d1, d2, d3 and is
unaffected by all the other sharing reduction rules, i.e. if t ⇝_D u then W^i(t) > W^i(u).
If t ⇝_{(L,C)} u then W^i(t) = W^i(u). The third and last measure we consider is
the number of closures in the term, where it can be easily observed that the
rewrite rules c1 and c2 strictly decrease this measure, and that the ⇝_L rules do
not alter the number of closures. We then use this along with height and weight
to define a sharing measure on terms.


Definition 14. The sharing measure of a Λ_a^S-term t is a triple (W(t), C,
H(t)), where C is the number of closures in the term t. We compare sharing
measures by using the lexicographical preferences according to W > C > H.


Theorem 1. Sharing reduction ⇝_{(D,L,C)} is strongly normalising.


Now that we have proven the sharing reductions are strongly normalising, we
can prove that they are confluent for closed terms.


Theorem 2. The sharing reduction relation ⇝_{(D,L,C)} is confluent.

Proof. Lemma 1 tells us that the denotation is preserved under reduction, i.e. for
s ⇝_{(D,L,C)} t, ⟦s⟧ = ⟦t⟧. Therefore given t ⇝*_{(D,L,C)} s_1 and t ⇝*_{(D,L,C)} s_2, ⟦t⟧ =
⟦s_1⟧ = ⟦s_2⟧. Since we know that sharing reductions are strongly normalising, we
know there exist terms u_1 and u_2 in sharing normal form such that s_1 ⇝*_{(D,L,C)}
u_1 and s_2 ⇝*_{(D,L,C)} u_2. Lemma 1 tells us that terms in sharing normal form are in
correspondence with their denotations, i.e. ⦇⟦t⟧⦈ = t. Since by Lemma 1 we know
⟦u_1⟧ = ⟦s_1⟧ = ⟦s_2⟧ = ⟦u_2⟧, and by Lemma 1 ⦇⟦u_1⟧⦈ = u_1 and ⦇⟦u_2⟧⦈ = u_2, we
can conclude u_1 = u_2. Hence, we prove confluence. □


5 Preservation of Strong Normalisation and Confluence 


A β-step in our calculus may occur within a weakening, and therefore is simulated
by zero β-steps in the λ-calculus. Therefore if there is an infinite reduction path
located inside a weakening in Λ_a^S, then the reduction path is not preserved in the
corresponding λ-term as there are no weakenings. To deal with this, just as done
in [2, 16, 17], we make use of the weakening calculus. A β-step is non-deleting
precisely because of the weakening construct. If a β-step would be deleting, then
the weakening calculus would instead keep the deleted term around as ‘garbage’,
which can continue to reduce unless explicitly ‘garbage-collected’ by extra
(non-β) reduction steps. PSN has already been shown for the weakening calculus through
the use of a perpetual strategy in [16]. A part of proving PSN is then using the
weakening calculus to prove that if t ∈ Λ_a^S has an infinite reduction path, then its
translation into the weakening calculus also has an infinite reduction path.


Definition 15. The w-terms of the weakening calculus (Λ_W) are

  T, U, V ::= x | λx.T (*) | U V | T[← U] | •        (*) where x ∈ (T)_fv


The terms are variable, abstraction, application, weakening, and a bullet.
In the weakening T[← U], the subterm U is weakened. The interpretation of
atomic terms to weakening terms, ⟦− | − | −⟧_W, can be seen as an extension of the
translation into the λ-calculus (Definition 9).


Definition 16. The interpretation ⟦− | − | −⟧_W : Λ_a^S × (V → Λ_W) × (V → V) → Λ_W
with maps σ : V → Λ_W and γ : V → V is defined as an extension of the translation
in Definition 9 with the following additional special cases.


  ⟦u[← t] | σ | γ⟧_W = ⟦u | σ | γ⟧_W [← ⟦t | σ | γ⟧_W]
  ⟦u[ | c⟨c⟩ [Γ]] | σ | γ⟧_W = ⟦u[Γ] | σ[c ↦ •] | γ⟧_W
  ⟦u[ | c⟨x_1, …, x_n⟩ [Γ]] | σ | γ⟧_W = ⟦u[Γ] | σ′ | γ⟧_W


where σ′(z) := if z ∈ {x_1, …, x_n} then σ(z){•/γ(c)} else σ(z)


We say ⟦t⟧^W = ⟦t | I | I⟧_W where I is the identity function. We also have
translations of the weakening calculus to and from the λ-calculus. Both of these
translations were provided in [16]. The interpretation ⌊−⌋ from weakening terms
to λ-terms discards all weakenings.

Definition 17. For M ∈ Λ, the interpretation ⦇−⦈^W : Λ → Λ_W is defined below.

  ⦇x⦈^W = x
  ⦇M N⦈^W = ⦇M⦈^W ⦇N⦈^W
  ⦇λx.N⦈^W = λx.⦇N⦈^W            if x ∈ (N)_fv
  ⦇λx.N⦈^W = λx.⦇N⦈^W[← x]       otherwise


The following equalities can be observed, where σ^Λ(z) = ⌊σ^W(z)⌋.

Proposition 4. For N ∈ Λ and t ∈ Λ_a^S the following properties hold

  ⌊⟦t | σ^W | γ⟧_W⌋ = ⟦t | σ^Λ | γ⟧        ⟦⦇N⦈⟧^W = ⦇N⦈^W        ⌊⦇N⦈^W⌋ = N

where for each {x ↦ M} ∈ σ^W, {x ↦ ⌊M⌋} ∈ σ^Λ.


Definition 18. In the weakening calculus, β-reduction is defined as follows,
where [Γ] are weakening constructs.

  ((λx.T)[Γ]) U →_β T{U/x}[Γ]


Proposition 5. If N ∈ Λ is strongly normalising, then so is ⦇N⦈^W.


When translating from Λ_a^S to Λ_W, weakenings are maintained whilst sharings
are interpreted via substitution. Thus the reduction rules in the weakening
calculus cover the spinal reductions for nullary distributors and weakenings.


Definition 19. Weakening reduction (→_W) proceeds as follows.


  U[← T] V →_W (U V)[← T]             U V[← T] →_W (U V)[← T]
  T[← U[← V]] →_W T[← U][← V]         T[← λx.U] →_W T[← U{•/x}]
  T[← U V] →_W T[← U][← V]            T[← •] →_W T
  T[← U] →_W T  (1)                   λx.T[← U] →_W (λx.T)[← U]  (2)


(1) if U is a subterm of T and (2) if x ∉ (U)_fv


It is easy to see that these rules correspond to special cases of the sharing
reduction rules for Λ_a^S. This resemblance is confirmed by the following Lemma,
proven in [25, pp. 82-86]. We use this to show how Λ_a^S enjoys PSN.


Lemma 3. If t ⇝_β u then ⟦t⟧^W →⁺_β ⟦u⟧^W. If t ⇝_{(C,D,L)} u and, for any x ∈
(t)_bv ∪ (t)_fp, we have x ∉ (σ(z))_fv for all z, then


  ⟦t | σ | γ⟧_W →*_W ⟦u | σ | γ⟧_W


Lemma 4. If t ∈ Λ_a^S has an infinite reduction path, then ⟦t⟧^W also has an
infinite reduction path.


Proof. Due to Theorem 1, we know that the infinite reduction path contains
infinitely many β-steps. This means that in the reduction sequence, between each β-step,
there are finitely many ⇝_{(D,L,C)} reduction steps. Lemma 3 says each ⇝_{(D,L,C)}
step in Λ_a^S corresponds to zero or more weakening reductions (→*_W). Lemma
3 also says that each β-step in Λ_a^S corresponds to one or more β-steps in Λ_W.
Therefore, it must be that ⟦t⟧^W also has an infinite reduction path. □

Theorem 3. If N ∈ Λ is strongly normalising, then so is ⦇N⦈.


Proof. For a given N ∈ Λ that is strongly normalising, we know by Proposition 5
that ⦇N⦈^W is strongly normalising. Then ⟦⦇N⦈⟧^W is strongly normalising, since
Proposition 4 states that ⦇N⦈^W = ⟦⦇N⦈⟧^W. Then Lemma 4, which (read
contrapositively) states that if ⟦t⟧^W is strongly normalising then t is strongly
normalising, proves that ⦇N⦈ is strongly normalising. □


We also prove confluence, which is already known for the λ-calculus [11]. We
first observe that a β-step in the λ-calculus is simulated in Λ_a^S by one β-step
followed by zero or more sharing reductions.


Lemma 5. Given N, M ∈ Λ. If N →_β M, then ⦇N⦈ ⇝_β ⇝*_{(D,L,C)} ⦇M⦈.


Proof. This is proven by Sherratt in [25, Lemma 67].


Theorem 4. Given t, s_1, s_2 ∈ Λ_a^S: if t ⇝*_{(β,D,L,C)} s_1 and t ⇝*_{(β,D,L,C)} s_2, there
exists a u ∈ Λ_a^S such that s_1 ⇝*_{(β,D,L,C)} u and s_2 ⇝*_{(β,D,L,C)} u.


Proof. Suppose t ⇝*_{(β,D,L,C)} s_1 and t ⇝*_{(β,D,L,C)} s_2. Then we have ⟦t⟧ →*_β ⟦s_1⟧
and ⟦t⟧ →*_β ⟦s_2⟧. By the Church-Rosser theorem, there exists an M ∈ Λ such that
⟦s_1⟧ →*_β M and ⟦s_2⟧ →*_β M. Due to Lemma 2, ⦇⟦s_1⟧⦈ = s_1′ and ⦇⟦s_2⟧⦈ = s_2′
where s_1′, s_2′ ∈ Λ_a^S are in sharing normal form. Then thanks to Lemma 5 we know
s_1′ ⇝*_{(β,D,L,C)} ⦇M⦈ and s_2′ ⇝*_{(β,D,L,C)} ⦇M⦈. Combined, we get confluence. □


6 Conclusion, related work, and future directions 


We have studied the interaction between the switch and the medial rule, the 
two characteristic inference rules of deep inference. We built a Curry-Howard 
interpretation based on this interaction, whose resulting calculus not only has 
the ability to duplicate terms atomically but can also duplicate solely the spine 
of an abstraction such that beta reduction can proceed on the duplicates. We 
show that this calculus has natural properties with respect to the λ-calculus.

This work, which started as an investigation into the Curry-Howard corre- 
spondence of the switch rule [25], fits into a broader effort to give a computational 
interpretation to intuitionistic deep-inference proof theory. Brünnler and McKinley
[9] give a natural reduction mechanism without medial (or switch), and observe
that preservation of strong normalization fails. Guenot and Straßburger [14]
investigate a different switch rule, corresponding to the implication-left rule of
sequent calculus. He [17] extends the atomic λ-calculus to the λμ-calculus.

Our future goal is to develop the intuitionistic open deduction formalism towards
optimal reduction [23, 21, 3], via the remaining medial and switch rules [26].


Abstract. In this paper, we study active learning algorithms for weighted
automata over a semiring. We show that a variant of Angluin’s seminal
L* algorithm works when the semiring is a principal ideal domain, but
not for general semirings such as the natural numbers.


1 Introduction 


Angluin’s seminal L* algorithm [4] for active learning of deterministic automata 
(DFAs) has been successfully used in many verification tasks, including in au- 
tomatically building formal models of chips in bank cards or finding bugs in 
network protocols (see [27,14] for a broad overview of successful applications 
of active learning). While DFAs are expressive enough to capture interesting 
properties, certain verification tasks require more expressive models. This moti- 
vated several researchers to extend L* to other types of automata, notably Mealy 
machines [28,24], register automata [15,22,1], and nominal automata [20]. 
Weighted finite automata (WFAs) are an important model made popular due 
to their applicability in image processing and speech recognition tasks [11,21]. 
The model is prevalent in other areas, including bioinformatics [2] and formal 
verification [3]. Passive learning algorithms and associated complexity results 
have appeared in the literature (see e.g. [5] for an overview), whereas active 
learning has been less studied [6,7]. Furthermore, the existing learning algorithms,
both passive and active, have been developed assuming the weights in
the automaton are drawn from a field, such as the real numbers.† To the best
of our knowledge, no learning algorithms, whether passive or active, have been
developed for WFAs in which the weights are drawn from a general semiring.


* The research leading to this work was partially funded by the European Union’s 
Horizon 2020 research and innovation programme under the ERC Starting Grant 
ProFoundNet (grant code 679127) and the Marie Skłodowska-Curie Grant Agree- 
ment No. 795119, by the EPSRC Standard Grant CLeVer (EP/S028641/1) and by 
GCHQ via the VeTSS grant “Automated black-box verification of networking sys- 
tems” (4207703/RFA 15845). 

† Balle and Mohri [6] define WFAs generically over a semiring but then restrict to fields
from Section 3 onwards as they present an overview of existing learning algorithms.

In this paper, we explore active learning for WFAs over a general semiring.
The main contributions of the paper are as follows:


1. We introduce a weighted variant of L* parametric on an arbitrary semiring, 
together with sufficient conditions for termination (Section 4). 

2. We show that for general semirings our algorithm might not terminate. In 
particular, if the semiring is the natural numbers, one of the steps of the 
algorithm might not converge (Section 5). 

3. We prove that the algorithm terminates if the semiring is a principal ideal 
domain, covering the known case of fields, but also the integers. This yields 
the first active learning algorithm for WFAs over the integers (Section 6). 


We start in Section 2 by explaining the learning algorithm for WFAs over 
the reals and pointing out the challenges in extending it to arbitrary semirings. 


2 Overview of the Approach 


In this section, we give an overview of the work developed in the paper through 
examples. We start by informally explaining the general algorithm for learning 
weighted automata that we introduce in Section 4, for the case where the semir- 
ing is a field. More specifically, for simplicity we consider the field of real numbers 
throughout this section. Later in the section, we illustrate why this algorithm 
does not work for an arbitrary semiring. 

Angluin’s L* algorithm provides a procedure to learn the minimal DFA accepting
a certain (unknown) regular language. In the weighted variant we will
introduce in Section 4, for the specific case of the field of real numbers, the
algorithm produces the minimal WFA accepting a weighted rational language (or
formal power series) ℒ: A* → ℝ.

A WFA over ℝ consists of a set of states, a linear combination of initial
states, a transition function that for each state and input symbol produces a
linear combination of successor states, and an output value in ℝ for each state
(Definition 5). As an example, consider the WFA over A = {a} below.


[Figure: a two-state WFA over A = {a} with states q0 and q1.]

Here q0 is the only initial state, with weight 1, as indicated by the arrow into
it that has no origin. When reading a, q0 transitions with weight 1 to itself and
also with weight 1 to q1; q1 transitions with weight 2 just to itself. The output
of q0 is 2 and the output of q1 is 3.

The language of a WFA is determined by letting it read a given word and
determining the final output according to the weights and outputs assigned to
individual states. More precisely, suppose we want to read the word aaa in the
example WFA above. Initially, q0 is assigned weight 1 and q1 weight 0. Processing
the first a then leads to q0 retaining weight 1, as it has a self-loop with weight 1,
and q1 obtaining weight 1 as well. With the next a, the weight of q0 still remains
1, but the weight of q1 doubles due to its self-loop of weight 2 and is added to
the weight 1 coming from q0, leading to a total of 3. Similarly, after the last a
the weights are 1 for q0 and 7 for q1. Since q0 has output 2 and q1 output 3, the
final result is 2·1 + 3·7 = 23.
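
The run on aaa can be replayed mechanically. The following is a minimal sketch, assuming a dictionary-based encoding of the example WFA; the function name `wfa_weight` and the encoding itself are illustrative choices, not notation from the paper.

```python
def wfa_weight(initial, transitions, output, word):
    """Weight a WFA assigns to `word`.

    initial:     dict state -> initial weight
    transitions: dict (state, symbol) -> dict successor -> weight
    output:      dict state -> output weight
    """
    current = dict(initial)
    for symbol in word:
        following = {}
        for state, weight in current.items():
            for successor, w in transitions.get((state, symbol), {}).items():
                # Weights along a path multiply; weights of parallel paths add.
                following[successor] = following.get(successor, 0) + weight * w
        current = following
    return sum(weight * output[state] for state, weight in current.items())


# The example WFA: q0 loops with weight 1 and feeds q1 with weight 1;
# q1 loops with weight 2; the outputs are 2 for q0 and 3 for q1.
initial = {"q0": 1, "q1": 0}
transitions = {("q0", "a"): {"q0": 1, "q1": 1}, ("q1", "a"): {"q1": 2}}
output = {"q0": 2, "q1": 3}

result = wfa_weight(initial, transitions, output, "aaa")
print(result)  # 23
```

The intermediate state weights are (1, 1), (1, 3) and (1, 7), matching the step-by-step account above.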

The learning algorithm assumes access to a teacher (sometimes also called 
oracle), who answers two types of queries: 


– membership queries, consisting of a single word w ∈ A*, to which the teacher
replies with a weight ℒ(w) ∈ ℝ;

– equivalence queries, consisting of a hypothesis WFA 𝒜, to which the teacher
replies yes if its language ℒ_𝒜 equals the target language ℒ and no otherwise,
providing a counterexample w ∈ A* such that ℒ(w) ≠ ℒ_𝒜(w).


In practice, membership queries are often easily implemented by interacting with 
the system one wants to model the behaviour of. However, equivalence queries 
are more complicated—as the perfect teacher does not exist and the target au- 
tomaton is not known they are commonly approximated by testing. Such testing 
can however be done exhaustively if a bound on the number of states of the tar- 
get automaton is known. Equivalence queries can also be implemented exactly 
when learning algorithms are being compared experimentally on generated au- 
tomata whose languages form the targets. In this case, standard methods for 
language equivalence, such as the ones based on bisimulations [9], can be used. 
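
Approximating an equivalence query by testing can be sketched as follows. This is an illustration under stated assumptions rather than the paper's procedure: the function name `equivalence_query`, the length bound `max_len`, and the shortest-first search order are all choices made here, and the two example languages are hypothetical stand-ins for a target and a hypothesis.

```python
from itertools import product


def equivalence_query(hypothesis, target, alphabet, max_len):
    """Test all words up to length `max_len`, shortest first.

    Returns None if no difference is found (an approximate "yes"),
    otherwise the first word on which the two languages disagree.
    """
    for length in range(max_len + 1):
        for letters in product(alphabet, repeat=length):
            word = "".join(letters)
            if hypothesis(word) != target(word):
                return word
    return None


def target(word):
    # Hypothetical target language over {a}: a^j is mapped to 2^j - 1.
    return 2 ** len(word) - 1


def hypothesis(word):
    # Hypothetical wrong guess: 0 on the empty word, 3^(j-1) on a^j for j >= 1.
    return 0 if word == "" else 3 ** (len(word) - 1)


counterexample = equivalence_query(hypothesis, target, ["a"], max_len=5)
```

With these two languages the first disagreement is at aaa, where the guess yields 9 and the target 7. A real teacher may return any counterexample, not necessarily a shortest one.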

The learning algorithm incrementally builds an observation table, which at
each stage contains partial information about the language ℒ determined by two
finite sets S, E ⊆ A*. The algorithm fills the table through membership queries.
As an example, and to set notation, consider the following table (over A = {a}).


               E
           ε   a   aa          row: S → ℝ^E
  S    ε   0   1   3           row(u)(v) = ℒ(uv)
       a   1   3   7
  S·A  aa  3   7   15          srow: S·A → ℝ^E
                               srow(ua)(v) = ℒ(uav)


This table indicates that ℒ assigns 0 to ε, 1 to a, 3 to aa, 7 to aaa, and
15 to aaaa. For instance, we see that row(a)(aa) = srow(aa)(a) = 7. Since row
and srow are fully determined by the language ℒ, we will refer to an observation
table as a pair (S, E), leaving the language ℒ implicit.
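
A small sketch of how such a table is filled through membership queries. The names `build_table` and `membership` are assumptions, and the membership oracle below uses the closed form 2^j − 1 for the word a^j, which is consistent with the entries listed above.

```python
def build_table(S, E, alphabet, membership):
    """Fill row and srow by membership queries.

    row[u][k]  = L(u + E[k])   for u in S
    srow[t][k] = L(t + E[k])   for t = u + a with u in S, a in alphabet
    """
    row = {u: [membership(u + v) for v in E] for u in S}
    srow = {u + a: [membership(u + a + v) for v in E]
            for u in S for a in alphabet}
    return row, srow


def membership(word):
    # Stand-in for the teacher: weight 2^j - 1 for the word a^j.
    return 2 ** len(word) - 1


S = ["", "a"]        # the empty string plays the role of ε
E = ["", "a", "aa"]
row, srow = build_table(S, E, ["a"], membership)
```

Here `row["a"][E.index("aa")]` and `srow["aa"][E.index("a")]` both equal 7, as both look up the weight of aaa.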

If the observation table (S, E) satisfies certain properties described below,
then it represents a WFA (S, δ, i, o), called the hypothesis, as follows:


– δ: S → (ℝ^S)^A is a linear map defined by choosing for δ(s)(a) a linear
combination over S of which the rows evaluate to srow(sa);

– i: S → ℝ is the initial weight map defined as i(ε) = 1 and i(s) = 0 for s ≠ ε;

– o: S → ℝ is the output weight map defined as o(s) = row(s)(ε).

For this to be well-defined, we need to have ε ∈ S (for the initial weights) and
ε ∈ E (for the output weights), and for the transition function there is a crucial
property of the table that needs to hold: closedness. In the weighted setting, a
table is closed if for all t ∈ S · A, there exist r_s ∈ ℝ for all s ∈ S such that


  srow(t) = Σ_{s∈S} r_s · row(s).


If this is not the case for a given t ∈ S · A, the algorithm adds t to S. The table
is repeatedly extended in this manner until it is closed. The algorithm then
constructs a hypothesis, using the closedness witnesses to determine transitions,
and poses an equivalence query to the teacher. It terminates when the answer is
yes; otherwise it extends the table with the counterexample provided by adding
all its suffixes to E, and the procedure continues by closing again the resulting
table. In the next subsection we describe the algorithm through an example.
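
Over a field, the closedness check amounts to solving a linear system. The sketch below uses exact rational arithmetic and Gauss-Jordan elimination; the names `solve_combination` and `is_closed` are assumptions, and rationals stand in for the reals. It is one possible way to find closedness witnesses, not necessarily the one used in the paper.

```python
from fractions import Fraction


def solve_combination(rows, target):
    """Find coefficients r with sum_k r[k] * rows[k] == target over the rationals.

    Returns a list of Fractions (free variables set to 0), or None if no
    such combination exists.
    """
    n_unknowns, n_eqs = len(rows), len(target)
    # One equation per table column; the last entry is the right-hand side.
    m = [[Fraction(rows[k][e]) for k in range(n_unknowns)] + [Fraction(target[e])]
         for e in range(n_eqs)]
    pivots, r = [], 0
    for c in range(n_unknowns):
        pivot = next((i for i in range(r, n_eqs) if m[i][c] != 0), None)
        if pivot is None:
            continue
        m[r], m[pivot] = m[pivot], m[r]
        m[r] = [x / m[r][c] for x in m[r]]
        for i in range(n_eqs):
            if i != r and m[i][c] != 0:
                factor = m[i][c]
                m[i] = [x - factor * y for x, y in zip(m[i], m[r])]
        pivots.append(c)
        r += 1
    # A zero row with a nonzero right-hand side means the system is inconsistent.
    if any(all(x == 0 for x in eq[:-1]) and eq[-1] != 0 for eq in m):
        return None
    coeffs = [Fraction(0)] * n_unknowns
    for i, c in enumerate(pivots):
        coeffs[c] = m[i][-1]
    return coeffs


def is_closed(row, srow):
    """A table is closed if every srow(t) is a linear combination of the rows."""
    rows = list(row.values())
    return all(solve_combination(rows, t) is not None for t in srow.values())


# Tables for the language mapping a^j to 2^j - 1, with E = {ε}.
open_table = is_closed({"": [0]}, {"a": [1]})
closed_table = is_closed({"": [0], "a": [1]}, {"a": [1], "aa": [3]})
# With E = {ε, a, aa, aaa}, the witness for srow(aa) is -2 * row(ε) + 3 * row(a).
witness = solve_combination([[0, 1, 3, 7], [1, 3, 7, 15]], [3, 7, 15, 31])
```

The returned coefficients are exactly the closedness witnesses that determine the transitions of the hypothesis.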


Remark 1. The original L* algorithm requires a second property to construct 
a hypothesis, called consistency. Consistency is difficult to check in extended 
settings, so the present paper is based on a variant of the algorithm inspired by 
Maler and Pnueli [19] where only closedness is checked and counterexamples are 
handled differently. See [13] for an overview of consistency in different settings. 


2.1 Example: Learning a Weighted Language over the Reals 


Throughout this section we consider the following weighted language:

  ℒ: {a}* → ℝ        ℒ(a^j) = 2^j − 1.


The minimal WFA recognising it has 2 states. We will illustrate how the weighted
variant of Angluin’s algorithm recovers this WFA.

We start from S = E = {ε}, and fill the entries of the table on the left below
by asking membership queries for ε and a. The table is not closed and hence we
build the table on its right, adding the membership result for aa. The resulting
table is closed, as srow(aa) = 3 · row(a), so we construct the hypothesis 𝒜₁.


       ε                ε
   ε   0            ε   0
   a   1            a   1
                    aa  3

[Figure: hypothesis automaton 𝒜₁.]


The teacher replies no and gives the counterexample aaa, which is assigned 9 by
the hypothesis automaton 𝒜₁ but 7 in the language. Therefore, we extend E ←
E ∪ {a, aa, aaa}. The table becomes the one below. It is closed, as srow(aa) =
3 · row(a) − 2 · row(ε), so we construct a new hypothesis 𝒜₂.

        ε   a   aa   aaa
   ε    0   1   3    7
   a    1   3   7    15
   aa   3   7   15   31

[Figure: hypothesis automaton 𝒜₂.]


The teacher replies yes because 𝒜₂ accepts the intended language assigning 2^j −
1 ∈ ℝ to the word a^j, and the algorithm terminates with the correct automaton.
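
The two hypotheses can be checked numerically. The sketch below builds them from the closedness witnesses stated in the text (srow(a) = row(a), then srow(aa) = 3 · row(a), and later srow(aa) = 3 · row(a) − 2 · row(ε)), with outputs o(s) = row(s)(ε). The dictionary encoding and the names `wfa_weight`, `hypothesis_1` and `hypothesis_2` are assumptions; the automata are derived here from the tables rather than copied from the figures.

```python
def wfa_weight(initial, transitions, output, word):
    """Weight a WFA assigns to `word` (states and weights given as dicts)."""
    current = dict(initial)
    for symbol in word:
        following = {}
        for state, weight in current.items():
            for successor, w in transitions.get((state, symbol), {}).items():
                following[successor] = following.get(successor, 0) + weight * w
        current = following
    return sum(weight * output[state] for state, weight in current.items())


# States are the elements of S = {ε, a}; the empty string plays the role of ε.
initial = {"": 1, "a": 0}
output = {"": 0, "a": 1}  # o(s) = row(s)(ε)

# First hypothesis: δ(ε)(a) = a and δ(a)(a) = 3·a.
hypothesis_1 = {("", "a"): {"a": 1}, ("a", "a"): {"a": 3}}
# Second hypothesis: δ(ε)(a) = a and δ(a)(a) = 3·a − 2·ε.
hypothesis_2 = {("", "a"): {"a": 1}, ("a", "a"): {"a": 3, "": -2}}

weight_1 = wfa_weight(initial, hypothesis_1, output, "aaa")
weights_2 = [wfa_weight(initial, hypothesis_2, output, "a" * j) for j in range(8)]
```

Under this encoding the first hypothesis assigns 9 to aaa, as reported for the counterexample, while the second reproduces 2^j − 1 on every word tested.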


2.2 Learning Weighted Languages over Arbitrary Semirings 


Consider now the same language as above, but represented as a map over the
semiring of natural numbers ℒ: {a}* → ℕ instead of a map ℒ: {a}* → ℝ over
the reals. Accordingly, we consider a variant of the learning algorithm over the
semiring ℕ rather than the algorithm over ℝ described above. For the first part,
the run of the algorithm for ℕ is the same as above, but after receiving the
counterexample we can no longer observe that srow(aa) = 3 · row(a) − 2 · row(ε),
since −2 ∉ ℕ. In fact, there are no m, n ∈ ℕ such that srow(aa) = m · row(ε) +
n · row(a). To see this, consider the first two columns in the table and note that
3/7 is bigger than 0/1 = 0 and 1/3, so it cannot be obtained as a linear combination
of the latter two using natural numbers. We thus have a closedness defect and
update S ← S ∪ {aa}, leading to the table below.
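
The absence of natural-number witnesses can also be confirmed by exhaustive search, since every table entry is non-negative and each coefficient is therefore bounded by the target entries. The following is a sketch with an assumed helper name `nat_combination`; it is a brute-force check for these small tables, not the closedness procedure of the paper.

```python
from itertools import product


def nat_combination(rows, target):
    """Search for natural coefficients c with sum_k c[k] * rows[k] == target.

    All entries are assumed to be natural numbers, so coefficient k can be
    at most target[i] // rows[k][i] for every column i where rows[k][i] > 0.
    """
    bounds = []
    for r in rows:
        limits = [t // x for x, t in zip(r, target) if x > 0]
        # An all-zero row contributes nothing, so coefficient 0 suffices.
        bounds.append(min(limits) if limits else 0)
    for coeffs in product(*(range(b + 1) for b in bounds)):
        combo = [sum(c * r[i] for c, r in zip(coeffs, rows))
                 for i in range(len(target))]
        if combo == list(target):
            return coeffs
    return None


row_eps = [0, 1, 3, 7]
row_a = [1, 3, 7, 15]
row_aa = [3, 7, 15, 31]
row_aaa = [7, 15, 31, 63]

defect_1 = nat_combination([row_eps, row_a], row_aa)
defect_2 = nat_combination([row_eps, row_a, row_aa], row_aaa)
```

Both searches return `None`: neither srow(aa) nor, after adding aa to S, srow(aaa) is a natural-number combination of the existing rows.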


         | ε    a    aa   aaa
    -----+--------------------
    ε    | 0    1    3    7
    a    | 1    3    7    15
    aa   | 3    7    15   31
    -----+--------------------
    aaa  | 7    15   31   63

Again, the table is not closed, since $\frac{7}{15} > \frac{3}{7}$. In fact, these closedness defects
continue appearing indefinitely, leading to non-termination of the algorithm.
This is shown formally in Section 5.
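The closedness defects over $\mathbb{N}$ can also be observed by exhaustive search. The following Python sketch is an illustration only: the function names are hypothetical, the columns are fixed to $E = \{\varepsilon, a, aa, aaa\}$, and the search relies on the assumption that every row has at least one positive entry, which bounds each coefficient.

```python
from itertools import product

E = [0, 1, 2, 3]  # column labels a^0 .. a^3, represented by their exponents


def language(n):
    """The running example: the weight of a^n is 2^n - 1."""
    return 2**n - 1


def row(k):
    """Row of the word a^k restricted to the columns in E."""
    return [language(k + e) for e in E]


def natural_combination(rows, target):
    """Search for k_j in N with sum_j k_j * rows[j] == target, else None."""
    # k_j * rows[j] may not exceed target entrywise, which bounds k_j.
    bounds = [min(t // r for r, t in zip(rw, target) if r > 0) for rw in rows]
    for ks in product(*(range(b + 1) for b in bounds)):
        combo = [sum(k * rw[i] for k, rw in zip(ks, rows))
                 for i in range(len(target))]
        if combo == target:
            return ks
    return None


# With integer coefficients the successor row is reachable ...
print([3 * p - 2 * q for p, q in zip(row(1), row(0))] == row(2))
# ... but no combination with natural-number coefficients exists.
for n in range(1, 5):
    print(n, natural_combination([row(j) for j in range(n)], row(n)))
```

For each of the small tables tried here the search finds no combination, which matches the informal ratio argument above; the general statement is the one proved in Section 5.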
Note, however, that there does exist a WFA over N accepting this language: 


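[Figure: WFA with states $q_0$ (initial, output 0) and $q_1$ (output 1) and transitions
$q_0 \xrightarrow{a,1} q_0$, $q_0 \xrightarrow{a,1} q_1$, and $q_1 \xrightarrow{a,2} q_1$; the formal definition is given in the proof of Theorem 17.]   (1)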


The reason that the algorithm cannot find the correct automaton is closely 
related to the algebraic structure induced by the semiring. In the case of the reals, 
the algebras are vector spaces and the closedness checks induce increases in the 
dimension of the hypothesis WFA, which in turn cannot exceed the dimension 
of the minimal one for the language. In the case of commutative monoids, the 
algebras for the natural numbers, the notion of dimension does not exist and 
unfortunately the algorithm does not terminate. In Section 6 we show that one 
can get around this problem for a class of semirings which includes the integers.

We mentioned earlier that during experimental evaluation the target WFA
is known, and equivalence queries may be implemented via standard language 
equivalence methods. A further issue with arbitrary semirings is that language 
equivalence can be undecidable; that is the case, e.g., for the tropical semiring. 

In Section 3 we recall basic definitions used throughout the paper, after which 
Section 4 introduces our general algorithm with its (parameterised) termination 
proof of Theorem 14. We then proceed to prove non-termination of the example 
discussed above over the natural numbers in Section 5 before instantiating our 
algorithm to PIDs in Section 6 and showing that it terminates in Theorem 28. 
We conclude with a discussion of related and future work in Section 7. 


3 Preliminaries 


Throughout this paper we fix a semiring⁵ $\mathbb{S}$ and a finite alphabet $A$. We start
with basic definitions related to semimodules and weighted languages. 


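Definition 2 (Semimodule). A (left) semimodule $M$ over $\mathbb{S}$ consists of a
monoid structure on $M$, written using $+$ as the operation and $0$ as the unit,
together with a scalar multiplication map $\cdot\colon \mathbb{S} \times M \to M$ such that:

$$\begin{aligned}
s \cdot 0_M &= 0_M & 0_{\mathbb{S}} \cdot m &= 0_M & 1 \cdot m &= m \\
s \cdot (m + n) &= s \cdot m + s \cdot n & (s + r) \cdot m &= s \cdot m + r \cdot m & (sr) \cdot m &= s \cdot (r \cdot m)
\end{aligned}$$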


When the semiring is in fact a ring, we speak of a module rather than a semi- 
module. In the case of a field, the concept instantiates to a vector space. 


As an example, commutative monoids are the semimodules over the semiring
of natural numbers. Any semiring forms a semimodule over itself by instantiating
the scalar multiplication map to the internal multiplication. If $X$ is any set and $M$
is a semimodule, then $M^X$ with pointwise operations also forms a semimodule.
A similar semimodule is the free semimodule over $X$, which differs from $M^X$
in that it fixes $M$ to be $\mathbb{S}$ and requires its elements to have finite support. This
enables an important operation called linearisation.


Definition 3 (Free semimodule). The free semimodule over a set $X$ is given
by the set

$$V(X) = \{f\colon X \to \mathbb{S} \mid \mathsf{supp}(f) \text{ is finite}\}$$

with pointwise operations. Here $\mathsf{supp}(f) = \{x \in X \mid f(x) \neq 0\}$. We sometimes
identify the elements of $V(X)$ with formal sums over $X$. Any semimodule
isomorphic to $V(X)$ for some set $X$ is called free.


If $X$ is a finite set, then $V(X) = \mathbb{S}^X$. We now define linearisation of a function
into a semimodule, which uniquely extends it to a semimodule homomorphism,
witnessing the fact that $V(X)$ is free.


5 Rings and semirings considered in this paper are taken to be unital.

Definition 4 (Linearisation). Given a set $X$, a semimodule $M$, and a function
$f\colon X \to M$, we define the linearisation of $f$ as the semimodule homomorphism
$f^\sharp\colon V(X) \to M$ given by

$$f^\sharp(\alpha) = \sum_{x \in X} \alpha(x) \cdot f(x).$$

The $(-)^\sharp$ operation has an inverse that maps a semimodule homomorphism
$g\colon V(X) \to M$ to the function $g^\dagger\colon X \to M$ given by

$$g^\dagger(x) = g(\delta_x), \qquad \delta_x(y) = \begin{cases} 1 & \text{if } y = x \\ 0 & \text{if } y \neq x. \end{cases}$$
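As a concrete reading of Definition 4, the Python sketch below represents an element of $V(X)$ as a dictionary from elements of $X$ to their non-zero coefficients. This representation and the names `linearise` and `unit` are assumptions made for illustration; the semimodule operations are passed in explicitly, and the example uses integer coefficients on the rows of the table from Section 2.1.

```python
def linearise(f, add, scale, zero):
    """Extend f: X -> M to formal sums, given the operations of M."""
    def f_sharp(alpha):
        total = zero
        for x, coeff in alpha.items():      # alpha has finite support
            total = add(total, scale(coeff, f(x)))
        return total
    return f_sharp


def unit(x):
    """The formal sum delta_x: coefficient 1 on x and 0 elsewhere."""
    return {x: 1}


# Example: M is the set of 4-tuples of integers with pointwise operations.
rows = {"": (0, 1, 3, 7), "a": (1, 3, 7, 15)}


def add(u, v):
    return tuple(p + q for p, q in zip(u, v))


def scale(s, v):
    return tuple(s * p for p in v)


row_sharp = linearise(rows.__getitem__, add, scale, (0, 0, 0, 0))

print(row_sharp({"a": 3, "": -2}))       # (3, 7, 15, 31)
print(row_sharp(unit("a")) == rows["a"])  # True: composing with unit recovers f
```

The last line illustrates the inverse direction: evaluating the linearised map on $\delta_x$ returns $f(x)$.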


We proceed with the definition of WFAs and their languages. 


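Definition 5 (WFA). A weighted finite automaton (WFA) over $\mathbb{S}$ is a tuple
$(Q, \delta, i, o)$, where $Q$ is a finite set, $\delta\colon Q \to (\mathbb{S}^Q)^A$, and $i, o\colon Q \to \mathbb{S}$.


A weighted language (or just language) over $\mathbb{S}$ is a function $A^* \to \mathbb{S}$. To define
the language accepted by a WFA $\mathcal{A} = (Q, \delta, i, o)$, we first introduce the notions
of observability map $\mathsf{obs}_{\mathcal{A}}\colon V(Q) \to \mathbb{S}^{A^*}$ and reachability map
$\mathsf{reach}_{\mathcal{A}}\colon V(A^*) \to V(Q)$ as the semimodule homomorphisms given by

$$\begin{aligned}
\mathsf{reach}^\dagger_{\mathcal{A}}(\varepsilon) &= i & \mathsf{obs}_{\mathcal{A}}(m)(\varepsilon) &= o^\sharp(m) \\
\mathsf{reach}^\dagger_{\mathcal{A}}(ua) &= \delta^\sharp(\mathsf{reach}^\dagger_{\mathcal{A}}(u))(a) & \mathsf{obs}_{\mathcal{A}}(m)(au) &= \mathsf{obs}_{\mathcal{A}}(\delta^\sharp(m)(a))(u).
\end{aligned}$$

The language accepted by a WFA $\mathcal{A} = (Q, \delta, i, o)$ is the function $\mathcal{L}_{\mathcal{A}}\colon A^* \to \mathbb{S}$
given by $\mathcal{L}_{\mathcal{A}} = \mathsf{obs}_{\mathcal{A}}(i)$. Equivalently, one can define this as $\mathcal{L}_{\mathcal{A}} = o^\sharp \circ \mathsf{reach}^\dagger_{\mathcal{A}}$.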


4 General Algorithm for WFAs 


In this section we define the general algorithm for WFAs over S, as described 
informally in Section 2. Our algorithm assumes the existence of a closedness 
strategy (Definition 8), which allows one to check whether a table is closed, and 
in case it is, provide relevant witnesses. We then introduce sufficient conditions
on $\mathbb{S}$ and on the language $\mathcal{L}$ to be learned under which the algorithm terminates.


Definition 6 (Observation table). An observation table (or just table) $(S, E)$
consists of two sets $S, E \subseteq A^*$. We write $\mathsf{Table}_{\mathsf{fin}} = \mathcal{P}_f(A^*) \times \mathcal{P}_f(A^*)$ for
the set of finite tables (where $\mathcal{P}_f(X)$ denotes the collection of finite subsets
of a set $X$). Given a language $\mathcal{L}\colon A^* \to \mathbb{S}$, an observation table $(S, E)$ determines
the row function $\mathsf{row}_{(S,E,\mathcal{L})}\colon S \to \mathbb{S}^E$ and the successor row function
$\mathsf{srow}_{(S,E,\mathcal{L})}\colon S \cdot A \to \mathbb{S}^E$ as follows:

$$\mathsf{row}_{(S,E,\mathcal{L})}(w)(v) = \mathcal{L}(wv) \qquad \mathsf{srow}_{(S,E,\mathcal{L})}(wa)(v) = \mathcal{L}(wav).$$

We often write $\mathsf{row}_{\mathcal{L}}$ and $\mathsf{srow}_{\mathcal{L}}$, or even $\mathsf{row}$ and $\mathsf{srow}$, when the parameters are
clear from the context.

A table is closed if the successor rows are linear combinations of the existing
rows in $S$. To make this precise, we use the linearisation $\mathsf{row}^\sharp$ (Definition 4),
which extends $\mathsf{row}$ to linear combinations of words in $S$.


Definition 7 (Closedness). Given a language $\mathcal{L}$, a table $(S, E)$ is closed if for
all $w \in S$ and $a \in A$ there exists $\alpha \in V(S)$ such that $\mathsf{srow}(wa) = \mathsf{row}^\sharp(\alpha)$.


This corresponds to the notion of closedness described in Section 2. 
A further important ingredient of the algorithm is a method for checking 
whether a table is closed. This is captured by the notion of closedness strategy. 


Definition 8 (Closedness strategy). Given a language $\mathcal{L}$, a closedness strategy
for $\mathcal{L}$ is a family of computable functions

$$\left(\mathsf{cs}_{(S,E)}\colon S \cdot A \to \{\bot\} \cup V(S)\right)_{(S,E) \in \mathsf{Table}_{\mathsf{fin}}}$$

satisfying the following two properties:

- if $\mathsf{cs}_{(S,E)}(t) = \bot$, then there is no $\alpha \in V(S)$ s.t. $\mathsf{row}^\sharp(\alpha) = \mathsf{srow}(t)$, and
- if $\mathsf{cs}_{(S,E)}(t) \neq \bot$, then $\mathsf{row}^\sharp(\mathsf{cs}_{(S,E)}(t)) = \mathsf{srow}(t)$.


Thus, given a closedness strategy as above, a table $(S, E)$ is closed iff $\mathsf{cs}_{(S,E)}(t) \neq \bot$
for all $t \in S \cdot A$. More specifically, for each $t \in S \cdot A$ we have that $\mathsf{cs}_{(S,E)}(t) \neq \bot$
iff the (successor) row corresponding to $t$ already forms a linear combination of
rows labelled by $S$. In that case, this linear combination is returned by $\mathsf{cs}_{(S,E)}(t)$.
This is used to close tables in our learning algorithm, introduced below.

Examples of semirings and (classes of) languages that admit a closedness
strategy are described at the end of this section. Important for our algorithm
will be that closedness strategies are computable. This problem is equivalent to
solving systems of equations $Ax = b$, where $A$ is the matrix whose columns are
$\mathsf{row}(s)$ for $s \in S$, $x$ is a vector of length $|S|$, and $b$ is the vector consisting of
the row entries in $\mathsf{srow}(t)$ for some $t \in S \cdot A$. These observations motivate the
following definition.


Definition 9 (Solvability). A semiring $\mathbb{S}$ is solvable if a solution to any finite
system of linear equations of the form $Ax = b$ is computable.
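For a field with computable operations, such a solution procedure is ordinary Gaussian elimination. The Python sketch below works over the rationals using `fractions.Fraction` and follows the convention above that the columns of $A$ are the rows of the table. The function name `solve` and the choice to set free variables to 0 are assumptions of this sketch; it is one possible way to obtain a closedness strategy for a field, not the only one.

```python
from fractions import Fraction


def solve(columns, b):
    """Return x with sum_j x[j] * columns[j] == b over Q, or None."""
    n, m = len(b), len(columns)
    # Augmented matrix: one equation per entry of b (per table column).
    M = [[Fraction(columns[j][i]) for j in range(m)] + [Fraction(b[i])]
         for i in range(n)]
    pivots, r = [], 0
    for c in range(m):
        p = next((i for i in range(r, n) if M[i][c] != 0), None)
        if p is None:
            continue                      # no pivot: x[c] is a free variable
        M[r], M[p] = M[p], M[r]
        M[r] = [v / M[r][c] for v in M[r]]
        for i in range(n):
            if i != r and M[i][c] != 0:
                factor = M[i][c]
                M[i] = [vi - factor * vr for vi, vr in zip(M[i], M[r])]
        pivots.append(c)
        r += 1
    if any(M[i][m] != 0 for i in range(r, n)):
        return None                       # an equation 0 = non-zero remains
    x = [Fraction(0)] * m                 # free variables are set to 0
    for i, c in enumerate(pivots):
        x[c] = M[i][m]
    return x


row_eps, row_a, srow_aa = [0, 1, 3, 7], [1, 3, 7, 15], [3, 7, 15, 31]
print(solve([row_eps, row_a], srow_aa))  # coefficients -2 and 3
print(solve([[0]], [1]))                 # None: a closedness defect
```

Returning `None` plays the role of $\bot$ in Definition 8, and a returned vector is the witness used to build the hypothesis.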


We have the following correspondence. 


Proposition 10. For any language accepted by a WFA over any semiring there 
exists a closedness strategy if and only if the semiring is solvable. 


Proof. If the semiring is solvable, we obtain a closedness strategy by the remarks
prior to Definition 9. Conversely, we can construct a language that is non-zero
on finitely many words and encode in a table $(S, E)$ a given linear equation. To
be able to freely choose the value in each table cell, we can consider a sufficiently
large alphabet to make sure $S$ and $E$ contain only single-letter words. This avoids
dependencies within the table.

Algorithm 1 Abstract learning algorithm for WFA over $\mathbb{S}$
 1: $S, E \leftarrow \{\varepsilon\}$
 2: while true do
 3:     while $\mathsf{cs}_{(S,E)}(t) = \bot$ for some $t \in S \cdot A$ do
 4:         $S \leftarrow S \cup \{t\}$
 5:     for $s \in S$ do
 6:         $o(s) \leftarrow \mathsf{row}_{\mathcal{L}}(s)(\varepsilon)$
 7:         for $a \in A$ do
 8:             $\delta(s)(a) \leftarrow \mathsf{cs}_{(S,E)}(sa)$
 9:     if $\mathsf{EQ}(S, \delta, \varepsilon, o) = w \in A^*$ then
10:         $E \leftarrow E \cup \mathsf{suffixes}(w)$
11:     else
12:         return $(S, \delta, \varepsilon, o)$


We now have all the ingredients to formulate the algorithm to learn weighted 
languages over a general semiring. The pseudocode is displayed in Algorithm 1. 

The algorithm keeps a table $(S, E)$, and starts by initialising both $S$ and $E$ to
contain just the empty word. The inner while loop (lines 3-4) uses the closedness
strategy to repeatedly check whether the current table is closed and add new
rows in case it is not. Once the table is closed, a hypothesis is constructed,
again using the closedness strategy (lines 5-8). This hypothesis $(S, \delta, \varepsilon, o)$ is
then given to the teacher for an equivalence check. The equivalence check is
modelled by $\mathsf{EQ}$ (line 9) as follows: if the hypothesis is incorrect, the teacher
non-deterministically returns a counterexample $w \in A^*$, the condition evaluates
to true, and the suffixes of $w$ are added to $E$; otherwise, if the hypothesis is
correct, the condition on line 9 evaluates to false, and the algorithm returns
the correct hypothesis on line 12.
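To make the control flow of Algorithm 1 concrete, the following Python sketch instantiates it to the Boolean semiring, where a formal sum is a set of row labels and a row is the set of columns it accepts. Everything specific here is an assumption made for illustration: the target language (words over `ab` containing an `a`), the brute-force closedness strategy, and in particular the equivalence query, which is only approximated by comparing all words up to a fixed length instead of consulting a real teacher.

```python
from itertools import product

ALPHABET = "ab"


def target(word):
    """Hypothetical target language: words containing at least one 'a'."""
    return "a" in word


def row(word, E):
    """Row of `word`: the columns e in E such that word + e is accepted."""
    return frozenset(e for e in E if target(word + e))


def closedness_strategy(S, E, t):
    """Row labels whose rows join to the row of t, or None (bottom)."""
    goal = row(t, E)
    below = {s for s in S if row(s, E) <= goal}
    joined = frozenset().union(*(row(s, E) for s in below))
    return below if joined == goal else None


def accepts(hyp, word):
    """Run the hypothesis as a non-deterministic automaton."""
    _, delta, out = hyp
    current = {""}                          # the initial state is the empty word
    for letter in word:
        current = set().union(*(delta[s, letter] for s in current))
    return any(out[s] for s in current)


def equivalence_query(hyp, max_len=6):
    """Bounded stand-in for EQ: a word up to max_len where hyp is wrong."""
    for n in range(max_len + 1):
        for letters in product(ALPHABET, repeat=n):
            w = "".join(letters)
            if accepts(hyp, w) != target(w):
                return w
    return None


def learn():
    S, E = [""], [""]
    while True:
        while True:                         # lines 3-4: fix closedness defects
            defect = next((s + a for s in S for a in ALPHABET
                           if closedness_strategy(S, E, s + a) is None), None)
            if defect is None:
                break
            S.append(defect)
        out = {s: target(s) for s in S}     # lines 5-6: o(s) = row(s)(epsilon)
        delta = {(s, a): closedness_strategy(S, E, s + a)   # lines 7-8
                 for s in S for a in ALPHABET}
        hyp = (S, delta, out)
        w = equivalence_query(hyp)          # line 9
        if w is None:
            return hyp                      # line 12
        for i in range(len(w) + 1):         # line 10: add all suffixes of w
            if w[i:] not in E:
                E.append(w[i:])


states, _, _ = learn()
print(states)
```

For this small target the table closes after adding the row for `a`, and the bounded check finds no counterexample, so the sketch returns a two-state hypothesis. With a different closedness strategy, such as Gaussian elimination over a field, the same loop structure applies.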


4.1 Termination of the General Algorithm 


The main question remaining is: under which conditions does this algorithm
terminate and hence learn the unknown weighted language? We proceed to give
abstract conditions under which it terminates. There are two main assumptions: 


1. A way of measuring progress the algorithm makes with the observation table 
when it distinguishes linear combinations of rows that were previously equal, 
together with a bound on this progress (Definition 11). 

2. An assumption on the Hankel matrix of the input language (Definition 12), 
which makes sure we encounter finitely many closedness defects through- 
out any run of the algorithm. More specifically, we assume that the Hankel 
matrix satisfies a finite approximation property (Definition 13). 


The first assumption is captured by the definition of progress measure: 


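Definition 11 (Progress measure). A progress measure for a language $\mathcal{L}$ is
a function $\mathsf{size}\colon \mathsf{Table}_{\mathsf{fin}} \to \mathbb{N}$ such that

(a) there exists $n \in \mathbb{N}$ such that for all $(S, E) \in \mathsf{Table}_{\mathsf{fin}}$ we have $\mathsf{size}(S, E) \leq n$;

(b) given $(S, E), (S, E') \in \mathsf{Table}_{\mathsf{fin}}$ and $s_1, s_2 \in V(S)$ such that $E \subseteq E'$ and
$\mathsf{row}^\sharp_{(S,E,\mathcal{L})}(s_1) = \mathsf{row}^\sharp_{(S,E,\mathcal{L})}(s_2)$ but $\mathsf{row}^\sharp_{(S,E',\mathcal{L})}(s_1) \neq \mathsf{row}^\sharp_{(S,E',\mathcal{L})}(s_2)$, we
have $\mathsf{size}(S, E') > \mathsf{size}(S, E)$.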


A progress measure assigns a ‘size’ to each table, in such a way that (a) there is a 
global bound on the size of tables, and (b) if we extend a table with some proper 
tests in $E$, i.e., such that some combinations of rows in $\mathsf{row}^\sharp$ that were equal
before get distinguished by a newly added test, then the size of the extended 
table is properly above the size of the original table. This is used to ensure that, 
when adding certain counterexamples supplied by the teacher, the size of the 
table, measured according to the above size function, properly increases. 

The second assumption that we use for termination is phrased in terms of
the Hankel matrix associated to the input language $\mathcal{L}$, which represents $\mathcal{L}$ as the
(semimodule generated by the) infinite table where both the rows and columns
contain all words. The Hankel matrix is defined as follows.


Definition 12 (Hankel matrix). Given a language $\mathcal{L}\colon A^* \to \mathbb{S}$, the semimodule
generated by a table $(S, E)$ is given by the image of $\mathsf{row}^\sharp$. We refer to
the semimodule generated by the table $(A^*, A^*)$ as the Hankel matrix of $\mathcal{L}$.


The Hankel matrix is approximated by the tables that occur during the execution 
of the algorithm. For termination, we will therefore assume that this matrix 
satisfies the following finite approximation condition. 


Definition 13 (Ascending chain condition). We say that a semimodule $M$
satisfies the ascending chain condition if for all inclusion chains of subsemimodules
of $M$,

$$S_1 \subseteq S_2 \subseteq S_3 \subseteq \cdots,$$

there exists $n \in \mathbb{N}$ such that for all $m \geq n$ we have $S_m = S_n$.


Given the notions of progress measure, Hankel matrix and ascending chain 
condition, we can formulate the general theorem for termination of Algorithm 1. 


Theorem 14 (Termination of the abstract learning algorithm). In the 
presence of a progress measure, Algorithm 1 terminates whenever the Hankel ma- 
trix of the target language satisfies the ascending chain condition (Definition 13). 


Proof. Suppose the algorithm does not terminate. Then there is a sequence
$\{(S_n, E_n)\}_{n \in \mathbb{N}}$ of tables where $(S_0, E_0)$ is the initial table and $(S_{n+1}, E_{n+1})$ is
formed from $(S_n, E_n)$ after resolving a closedness defect or adding columns due
to a counterexample.

We write $H_n$ for the semimodule generated by the table $(S_n, A^*)$. We have
$S_n \subseteq S_{n+1}$ and thus $H_n \subseteq H_{n+1}$. Note that a closedness defect for $(S_n, E_n)$ is
also a closedness defect for $(S_n, A^*)$, so if we resolve the defect in the next step,
the inclusion $H_n \subseteq H_{n+1}$ is strict. Since these are all included in the Hankel
matrix, which satisfies the ascending chain condition, there must be an $n$ such
that for all $k \geq n$ we have that $(S_k, E_k)$ is closed.

In [13, Section 6] it is shown that in a general table used for learning automata
with side-effects given by a monad there exists a suffix of each counterexample
for the corresponding hypothesis that when added as a column label leads to
either a closedness defect or to distinguishing two combinations of rows in the
table. Since WFAs are automata with side-effects given by the free semimodule
monad⁶ and we add all suffixes of the counterexample to the set of column
labels, this also happens in our algorithm. Thus, for all $k \geq n$ where we process a
counterexample, there must be two linear combinations of rows distinguished, as
closedness is already guaranteed. Then the semimodule generated by $(S_k, E_k)$ is
a strict quotient of the semimodule generated by $(S_{k+1}, E_{k+1})$. By the progress
measure we then find $\mathsf{size}(S_k, E_k) < \mathsf{size}(S_{k+1}, E_{k+1})$, which cannot happen
infinitely often. We conclude that the algorithm must terminate.


To illustrate the hypotheses needed for Algorithm 1 and its termination (The- 
orem 14), we consider two classes of semirings for which learning algorithms are 
already known in the literature [7,13]. 


Example 15 (Weighted languages over fields). Consider any field for which the 
basic operations are computable. Solvability is then satisfied via a procedure such 
as Gaussian elimination, so by Proposition 10 there exists a closedness strategy. 
Hence, we can instantiate Algorithm 1 with S being such a field. 

For termination, we show that the hypotheses of Theorem 14 are satisfied 
whenever the input language is accepted by a WFA. First, a progress measure 
is given by the dimension of the vector space generated by the table. To see 
this, note that if we distinguish two linear combinations of rows, we can assume 
without loss of generality that one of these linear combinations in the extended 
table uses only basis elements. This in turn can be rewritten to distinguishing 
a single row from a linear combination of rows using field operations, with the 
property that the extended version of the single row is a basis element. Hence, the 
row was not a basis element in the original table, and therefore the dimension of 
the vector space generated by the table has increased. Adding rows and columns 
cannot decrease this dimension, so it is bounded by the dimension of the Hankel 
matrix. Since the language we want to learn is accepted by a WFA, the associated 
Hankel matrix has a finite dimension [10,12] (see also, e.g., [5]), providing a 
bound for our progress measure. 

Finally, for any ascending chain of subspaces of the Hankel matrix, these 
subspaces are of finite dimension bounded by the dimension of the Hankel matrix. 
The dimension increases along a strict subspace relation, so the chain converges. 


Example 16 (Weighted languages over finite semirings). Consider any finite semir- 
ing. Finiteness allows us to apply a brute force approach to solving systems of 
equations. This means the semiring is solvable, and hence a closedness strategy 
exists by Proposition 10. 

For termination, we can define a progress measure by assigning to each table
the size of the image of $\mathsf{row}^\sharp$. Distinguishing two linear combinations of rows
increases this measure. If the language we want to learn is accepted by a WFA,
then the Hankel matrix contains a subset of the linear combinations of the languages
of its states. Since there are only finitely many such linear combinations,
the Hankel matrix is finite, which bounds our measure. A finite semimodule such
as the Hankel matrix in this case does not admit infinite chains of subspaces.
We conclude by Theorem 14 that Algorithm 1 terminates for the instance that
the semiring $\mathbb{S}$ is finite, if the input language is accepted by a WFA over $\mathbb{S}$.

6 We note that [13] assumes the monad to preserve finite sets. However, the relevant
arguments do not depend on this.


For the Boolean semiring, an instance of the above finite semiring example, 
WFAs are non-deterministic finite automata. The algorithm we recover by in- 
stantiating Algorithm 1 to this case is close to the algorithm first described by 
Bollig et al. [8]. The main differences are that in their case the hypothesis has 
a state space given by a minimally generating subset of the distinct rows in the 
table rather than all elements of S, and they do apply a notion of consistency. 

In Section 6 we will show that Algorithm 1 can learn WFAs over principal 
ideal domains—notably including the integers—thus providing a strict general- 
isation of existing techniques. 


5 Issues with Arbitrary Semirings 


We concluded the previous section with examples of semirings for which Algo- 
rithm 1 terminates if the target language is accepted by a WFA. In this section, 
we prove a negative result for the algorithm over the semiring N: we show that 
it does not terminate on a certain language over N accepted by a WFA over N, 
as anticipated in Section 2.2. This means that Algorithm 1 does not work well 
for arbitrary semirings. The problem is that the Hankel matrix of a language 
recognised by a WFA does not necessarily satisfy the ascending chain condition
that is used to prove Theorem 14. In the example given in the proof below, the 
Hankel matrix is not even finitely generated. 


Theorem 17. There exists a WFA $\mathcal{A}_{\mathbb{N}}$ over $\mathbb{N}$ such that Algorithm 1 does not
terminate when given $\mathcal{L}_{\mathcal{A}_{\mathbb{N}}}$ as input, regardless of the closedness strategy used.


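Proof. Let $\mathcal{A}_{\mathbb{N}}$ be the automaton over the alphabet $\{a\}$ given in (1) in Section 2.2.
Formally, $\mathcal{A}_{\mathbb{N}} = (Q, \delta, i, o)$, where

$$\begin{aligned}
Q &= \{q_0, q_1\} & i &= q_0 & o(q_0) &= 0 \\
\delta(q_0)(a) &= q_0 + q_1 & \delta(q_1)(a) &= 2 q_1 & o(q_1) &= 1.
\end{aligned}$$

As mentioned in Section 2.2, the language $\mathcal{L}\colon \{a\}^* \to \mathbb{N}$ accepted by $\mathcal{A}_{\mathbb{N}}$ is
given by $\mathcal{L}(a^j) = 2^j - 1$. This can be shown more precisely as follows. First one
shows by induction on $j$ that $\mathsf{obs}_{\mathcal{A}_{\mathbb{N}}}(q_1)(a^j) = 2^j$ for all $j \in \mathbb{N}$; we leave the
straightforward argument to the reader. Second, we show, again by induction on
$j$, that $\mathsf{obs}_{\mathcal{A}_{\mathbb{N}}}(q_0)(a^j) = 2^j - 1$. This implies the claim, as $\mathcal{L} = \mathsf{obs}_{\mathcal{A}_{\mathbb{N}}}(q_0)$. For
$j = 0$ we have $\mathsf{obs}_{\mathcal{A}_{\mathbb{N}}}(q_0)(a^0) = o(q_0) = 0 = 2^0 - 1$ as required. For the inductive
step, let $j = k + 1$ and assume $\mathsf{obs}_{\mathcal{A}_{\mathbb{N}}}(q_0)(a^k) = 2^k - 1$. We calculate

$$\begin{aligned}
\mathsf{obs}_{\mathcal{A}_{\mathbb{N}}}(q_0)(a^{k+1}) &= \mathsf{obs}_{\mathcal{A}_{\mathbb{N}}}(q_0 + q_1)(a^k) \\
&= \mathsf{obs}_{\mathcal{A}_{\mathbb{N}}}(q_0)(a^k) + \mathsf{obs}_{\mathcal{A}_{\mathbb{N}}}(q_1)(a^k) \\
&= (2^k - 1) + 2^k \\
&= 2^{k+1} - 1.
\end{aligned}$$

Note that in particular the language $\mathcal{L}$ is injective.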
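The two inductive claims can be spot-checked numerically. The Python sketch below evaluates the observability recursion $\mathsf{obs}(m)(au) = \mathsf{obs}(\delta^\sharp(m)(a))(u)$ state by state for the automaton defined above; the dictionary encoding and the name `obs` are assumptions of the sketch, and a finite check of course does not replace the induction.

```python
from functools import lru_cache

# a-transitions and outputs of the two-state automaton over N defined above
DELTA = {"q0": {"q0": 1, "q1": 1}, "q1": {"q1": 2}}
OUT = {"q0": 0, "q1": 1}


@lru_cache(maxsize=None)
def obs(state, j):
    """Weight of the word a^j when the automaton is started in `state`."""
    if j == 0:
        return OUT[state]
    # Read one letter, then observe the remaining a^(j-1) from each successor.
    return sum(w * obs(succ, j - 1) for succ, w in DELTA[state].items())


print([obs("q1", j) for j in range(6)])  # [1, 2, 4, 8, 16, 32]
print([obs("q0", j) for j in range(6)])  # [0, 1, 3, 7, 15, 31]
```

The second list consists of pairwise distinct values, in line with the injectivity remark.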

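Towards a contradiction, suppose the algorithm does terminate with table
$(S, E)$. Let $J = \{j \in \mathbb{N} \mid a^j \in S\}$ and define $n = \max(J)$. Since the algorithm
terminates with table $(S, E)$, the latter must be closed. In particular, there exist
$k_j \in \mathbb{N}$ for all $j \in J$ such that $\sum_{j \in J} k_j \cdot \mathsf{row}_{\mathcal{L}}(a^j) = \mathsf{srow}_{\mathcal{L}}(a^n a)$. We consider
two cases. First assume $E = \{\varepsilon\}$ and let $\mathcal{A} = (Q', \delta', i', o')$ be the hypothesis.
For all $l \in \mathbb{N}$ we have $\mathsf{row}^\sharp_{\mathcal{L}}(\mathsf{reach}^\dagger_{\mathcal{A}}(a^l))(\varepsilon) = 2^l - 1$ because $\mathcal{A}$ must be correct.
Thus, if $a^l \in S \cdot A$, then $\mathsf{row}^\sharp_{\mathcal{L}}(\mathsf{reach}^\dagger_{\mathcal{A}}(a^l)) = \mathsf{srow}_{\mathcal{L}}(a^l)$. In particular,

$$\mathsf{row}^\sharp_{\mathcal{L}}(\mathsf{reach}^\dagger_{\mathcal{A}}(a^n a)) = \mathsf{srow}_{\mathcal{L}}(a^n a) = \sum_{j \in J} k_j \cdot \mathsf{row}_{\mathcal{L}}(a^j).$$

Note that we can choose the $k_j$ such that $\mathsf{reach}^\dagger_{\mathcal{A}}(a^n a) = \sum_{j \in J} k_j \cdot a^j$. Since

$$\begin{aligned}
\mathsf{row}^\sharp_{\mathcal{L}}\Big(\delta'^\sharp\Big(\sum_{j \in J} k_j \cdot a^j\Big)(a)\Big) &= \mathsf{row}^\sharp_{\mathcal{L}}\Big(\sum_{j \in J} k_j \cdot \delta'(a^j)(a)\Big) \\
&= \sum_{j \in J} k_j \cdot \mathsf{row}^\sharp_{\mathcal{L}}(\delta'(a^j)(a)) \\
&= \sum_{j \in J} k_j \cdot \mathsf{srow}_{\mathcal{L}}(a^j a),
\end{aligned}$$

we have $\mathsf{row}^\sharp_{\mathcal{L}}(\mathsf{reach}^\dagger_{\mathcal{A}}(a^n aa)) = \sum_{j \in J} k_j \cdot \mathsf{srow}_{\mathcal{L}}(a^j a)$ and therefore

$$\sum_{j \in J} k_j \cdot \mathsf{srow}_{\mathcal{L}}(a^j a)(\varepsilon) = \mathsf{row}^\sharp_{\mathcal{L}}(\mathsf{reach}^\dagger_{\mathcal{A}}(a^n aa))(\varepsilon) = 2^{n+2} - 1.$$

Then

$$\begin{aligned}
2^{n+2} - 1 &= \sum_{j \in J} k_j \cdot \mathsf{srow}_{\mathcal{L}}(a^j a)(\varepsilon) = \sum_{j \in J} k_j (2^{j+1} - 1) \\
&= 2\Big(\sum_{j \in J} k_j (2^j - 1)\Big) + \sum_{j \in J} k_j = 2(2^{n+1} - 1) + \sum_{j \in J} k_j,
\end{aligned}$$

so $\sum_{j \in J} k_j = 1$. This is only possible if there is $j_1 \in J$ s.t. $k_{j_1} = 1$ and $k_j = 0$
for all $j \in J \setminus \{j_1\}$. However, this implies that $\mathsf{row}_{\mathcal{L}}(a^{j_1}) = \mathsf{srow}_{\mathcal{L}}(a^n a)$, which
contradicts injectivity of $\mathcal{L}$ as $n \geq j_1$. Thus, the algorithm did not terminate.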




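For the other case, assume there is $a^m \in E$ such that $m \geq 1$. We have

$$2^{n+1} - 1 = \mathsf{srow}_{\mathcal{L}}(a^n a)(\varepsilon) = \sum_{j \in J} k_j \cdot \mathsf{row}_{\mathcal{L}}(a^j)(\varepsilon) = \sum_{j \in J} k_j (2^j - 1),$$

so

$$\begin{aligned}
\sum_{j \in J} k_j (2^{j+m} - 1) &= \sum_{j \in J} k_j \cdot \mathsf{row}_{\mathcal{L}}(a^j)(a^m) \\
&= \mathsf{srow}_{\mathcal{L}}(a^n a)(a^m) \\
&= 2^{n+m+1} - 1 \\
&= 2^m (2^{n+1} - 1) + 2^m - 1 \\
&= 2^m \Big(\sum_{j \in J} k_j (2^j - 1)\Big) + 2^m - 1 \\
&= \sum_{j \in J} k_j (2^{j+m} - 2^m) + 2^m - 1 \\
&= \Big(\sum_{j \in J} k_j (2^{j+m} - 1)\Big) + \Big(\sum_{j \in J} k_j (1 - 2^m)\Big) + 2^m - 1.
\end{aligned}$$

Then

$$\sum_{j \in J} k_j (2^m - 1) = 2^m - 1.$$

Since $m \geq 1$ this is only possible if there is $j_1 \in J$ s.t. $k_{j_1} = 1$ and $k_j = 0$
for all $j \in J \setminus \{j_1\}$. However, this implies $\mathsf{row}_{\mathcal{L}}(a^{j_1}) = \mathsf{srow}_{\mathcal{L}}(a^n a)$, which again
contradicts injectivity of $\mathcal{L}$ as $n \geq j_1$. Thus, the algorithm did not terminate.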


Remark 18. Our proof shows non-termination for a bigger class of algorithms 
than Algorithm 1; it uses only the definition of the hypothesis, that closedness 
is satisfied before constructing the hypothesis, that S and E contain the empty 
word, and that termination implies correctness. For instance, adding the prefixes 
of a counterexample to $S$ instead of its suffixes to $E$ will not fix the issue.


We have thus shown that our algorithm does not instantiate to a terminating 
one for an arbitrary semiring. To contrast this negative result, in the next section 
we identify a class of semirings not previously explored in the learning literature 
where we can guarantee a terminating instantiation. 


6 Learning WFAs over PIDs 


We show that for a subclass of semirings, namely principal ideal domains (PIDs),
the abstract learning algorithm of Section 4 terminates. This subclass includes
the integers, Gaussian integers, and rings of polynomials in one variable with
coefficients in a field. We will prove that the Hankel matrix of a language over
a PID accepted by a WFA has analogous properties to those of vector spaces:
finite rank, a notion of progress measure, and the ascending chain condition. We
also give a sufficient condition for PIDs to be solvable, which by Proposition 10
guarantees the existence of a closedness strategy for the learning algorithm.

To define PIDs, we first need to introduce ideals. Given a ring $\mathbb{S}$, a (left) ideal
$I$ of $\mathbb{S}$ is an additive subgroup of $\mathbb{S}$ s.t. for all $s \in \mathbb{S}$ and $i \in I$ we have $si \in I$.
The ideal $I$ is (left) principal if it is of the form $I = \mathbb{S}s$ for some $s \in \mathbb{S}$.


Definition 19 (PID). A principal ideal domain $P$ is a non-zero commutative
ring in which every ideal is principal and where for all $p_1, p_2 \in P$ such that
$p_1 p_2 = 0$ we have $p_1 = 0$ or $p_2 = 0$.


A module $M$ over a PID $P$ is called torsion free if for all $p \in P$ and any
$m \in M$ such that $p \cdot m = 0$ we have $p = 0$ or $m = 0$. It is a standard result that
a finitely generated module over a PID is torsion free if and only if it is free [17, Theorem 3.10].

The next definition of rank is analogous to that of the dimension of a vector 
space and will form the basis for the progress measure. 


Definition 20 (Rank). We define the rank of a finitely generated free module
$V(X)$ over a PID as $\mathsf{rank}(V(X)) = |X|$.


This definition extends to any finitely generated free module over a PID, as
$V(X) \cong V(Y)$ for finite sets $X$ and $Y$ implies $|X| = |Y|$ [17, Theorem 3.4].

Now that we have a candidate for a progress measure function, we need to 
prove it has the required properties. The following lemmas will help with this. 


Lemma 21. Given finitely generated free modules $M, N$ over a PID s.t.
$\mathsf{rank}(M) \geq \mathsf{rank}(N)$, any surjective module homomorphism $f\colon N \to M$ is injective.


Proof. Since $\mathsf{rank}(M) \geq \mathsf{rank}(N)$, there exists a surjective module homomorphism
$g\colon M \to N$. Therefore $g \circ f\colon N \to N$ is surjective and by [23] an iso. In
particular, $f$ is injective.


Lemma 22. If $M$ and $N$ are finitely generated free modules over a PID such
that there exists a surjective module homomorphism $f\colon N \to M$, then
$\mathsf{rank}(M) \leq \mathsf{rank}(N)$. If $f$ is not injective, then $\mathsf{rank}(M) < \mathsf{rank}(N)$.


Proof. Let $f\colon N \to M$ be a surjective module homomorphism. Suppose towards
a contradiction that $\mathsf{rank}(M) > \mathsf{rank}(N)$. By Lemma 21 $f$ is injective, so $M$ is
isomorphic to a submodule of $N$ and $\mathsf{rank}(M) \leq \mathsf{rank}(N)$ [17]; contradiction.
For the second part, suppose $f$ is not injective and assume towards a contradiction
that $\mathsf{rank}(M) \geq \mathsf{rank}(N)$. Again by Lemma 21 $f$ is injective, which is a
contradiction with our assumption. Thus, in this case $\mathsf{rank}(M) < \mathsf{rank}(N)$.


The lemma below states that the Hankel matrix of a weighted language over
a PID has finite rank which bounds the rank of any module generated by an
observation table. This will be used to define a progress measure, used to prove
termination of the learning algorithm for weighted languages over PIDs.

Lemma 23 (Hankel matrix rank for PIDs). When targeting a language
accepted by a WFA over a PID, any module generated by an observation table
is free. Moreover, the Hankel matrix has finite rank that bounds the rank of any
module generated by an observation table.


Proof. Given a WFA $\mathcal{A} = (Q, \delta, i, o)$, let $M$ be the free module generated by
$Q$. Note that the Hankel matrix is the image of the composition $\mathsf{obs}_{\mathcal{A}} \circ \mathsf{reach}_{\mathcal{A}}$.
Consider the image of the module homomorphism $\mathsf{reach}_{\mathcal{A}}\colon V(A^*) \to M$, which
we write as $R$. Since $R$ is a submodule of $M$, we know from [17] that $R$ is free
and finitely generated with $\mathsf{rank}(R) \leq \mathsf{rank}(M)$. The Hankel matrix can now be
obtained as the image of the restriction of $\mathsf{obs}_{\mathcal{A}}\colon M \to \mathbb{S}^{A^*}$ to the domain $R$.
Let $H$ be this image, which we know is finitely generated because $R$ is. Since
$H$ is a submodule of the torsion free module $\mathbb{S}^{A^*}$, it is also torsion free and
therefore free. We also have a surjective module homomorphism $s\colon R \to H$, so
by Lemma 22 we find $\mathsf{rank}(H) \leq \mathsf{rank}(R)$.

Let $N$ be the module generated by an observation table $(S, E)$. We have that
$N$ is a quotient of the module generated by $(S, A^*)$, which in turn is a submodule
of $H$. Using again [17] and Lemma 22 we conclude that $N$ is free and finitely
generated with $\mathsf{rank}(N) \leq \mathsf{rank}(H)$.


The second part of Lemma 23 would follow from a PID variant of Fliess’ theo- 
rem [12]. We are not aware of such a result, and leave this for future work. 


Proposition 24 (Progress measure for PIDs). There exists a progress 
measure for any language accepted by a WFA over a PID. 


Proof. Define $\mathsf{size}(S, E) = \mathsf{rank}(M)$, where $M$ is the module generated by the
table $(S, E)$. By Lemma 23 this is bounded by the rank of the Hankel matrix. If
$M$ and $N$ are modules generated by two tables such that $N$ is a strict quotient
of $M$, then by Lemma 22 we have $\mathsf{rank}(M) > \mathsf{rank}(N)$.
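For the integers this progress measure can be computed directly. The Python sketch below relies on the standard fact that the rank of the module generated by finitely many integer vectors equals the rank of the corresponding matrix over the rationals, and computes the latter by elimination with `fractions.Fraction`. The function name `rank` and the matrix encoding of tables are assumptions of the sketch; the example tables are those of Section 2.1.

```python
from fractions import Fraction


def rank(rows):
    """Rank of an integer matrix, computed by elimination over Q."""
    M = [[Fraction(v) for v in r] for r in rows]
    r = 0
    for c in range(len(M[0]) if M else 0):
        p = next((i for i in range(r, len(M)) if M[i][c] != 0), None)
        if p is None:
            continue
        M[r], M[p] = M[p], M[r]
        for i in range(r + 1, len(M)):
            factor = M[i][c] / M[r][c]
            M[i] = [vi - factor * vr for vi, vr in zip(M[i], M[r])]
        r += 1
    return r


# Rows labelled by the empty word and a, with E = {eps}: size 1.
print(rank([[0], [1]]))
# After adding the suffixes of the counterexample aaa as columns: size 2.
print(rank([[0, 1, 3, 7], [1, 3, 7, 15]]))
# Adding the row for aa does not increase the rank any further.
print(rank([[0, 1, 3, 7], [1, 3, 7, 15], [3, 7, 15, 31]]))
```

In this run the size increases exactly when the counterexample distinguishes two previously equal combinations of rows, and it stays at 2 afterwards.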


Recall that, for termination of the algorithm, Theorem 14 requires a progress 
measure, which we defined above, and it requires the Hankel matrix of the lan- 
guage to satisfy the ascending chain condition (Definition 13). Proposition 25 
shows that the latter is always the case for languages over PIDs. 


Proposition 25 (Ascending chain condition for PIDs). The Hankel matrix of
a language accepted by a WFA over a PID satisfies the ascending chain condition.


Proof. Let $H$ be the Hankel matrix, which has finite rank by Lemma 23. If

$$M_1 \subseteq M_2 \subseteq M_3 \subseteq \cdots$$

is any chain of submodules of $H$, then $M = \bigcup_{i \in \mathbb{N}} M_i$ is a submodule of $H$ and
therefore also of finite rank [17]. Let $B$ be a finite basis of $M$. There exists $n \in \mathbb{N}$
such that $B \subseteq M_n$, so $M_n = M$.


The last ingredient for the abstract algorithm is solvability of the semiring:
the following fact provides a sufficient condition for a PID to be solvable.

Proposition 26 (PID solvability). A PID $P$ is solvable if all of its ring
operations are computable and if each element of $P$ can be effectively factorised
into irreducible elements.


Proof. It is well-known that a system of equations of the form $Ax = b$ with
integer coefficients can be efficiently solved via computing the Smith normal
form [25] of $A$. The algorithm generalises to principal ideal domains, if we assume
that the factorisation of any given element of the principal ideal domain⁷ into
irreducible elements is computable, cf. the algorithm in [16, p. 79-84]. To see
that all steps in this algorithm can be computed, one has to keep in mind that
the factorisation can be used to determine the greatest common divisor of any
two elements of the principal ideal domain.


Remark 27. In the case that we are dealing with a Euclidean domain $P$, a
sufficient condition for P to be solvable is that Euclidean division is computable 
(again this can be deduced from inspecting the algorithm in [16, p. 79-84]). Such 
a PID behaves essentially like the ring of integers. 
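To indicate how Euclidean division yields solutions, the Python sketch below treats only the simplest special case over the integers: a single equation $ax + by = c$ in two unknowns, solved with the extended Euclidean algorithm. This is a deliberately reduced illustration and not the Smith normal form procedure of [16]; the function names are assumptions of the sketch.

```python
def extended_gcd(a, b):
    """Return (g, x, y) with g = gcd(a, b) >= 0 and a*x + b*y == g."""
    old_r, r = a, b
    old_x, x = 1, 0
    old_y, y = 0, 1
    while r != 0:
        q = old_r // r                      # Euclidean division
        old_r, r = r, old_r - q * r
        old_x, x = x, old_x - q * x
        old_y, y = y, old_y - q * y
    if old_r < 0:                           # normalise the sign of the gcd
        old_r, old_x, old_y = -old_r, -old_x, -old_y
    return old_r, old_x, old_y


def solve_in_integers(a, b, c):
    """Return integers (x, y) with a*x + b*y == c, or None if none exist."""
    g, x, y = extended_gcd(a, b)
    if g == 0:
        return (0, 0) if c == 0 else None
    if c % g != 0:
        return None                         # c is not in the ideal generated by g
    return x * (c // g), y * (c // g)


print(solve_in_integers(6, 10, 8))
print(solve_in_integers(6, 10, 7))          # None, since gcd(6, 10) = 2
```

The solvability test `c % g != 0` is an instance of the general principle that $ax + by = c$ has a solution exactly when $c$ lies in the principal ideal generated by the greatest common divisor of $a$ and $b$.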


Putting everything together, we obtain the main result of this section. 


Theorem 28 (Termination for PIDs). Algorithm 1 can be instantiated and 
terminates for any language accepted by a WFA over a PID of which all ring 
operations are computable and of which each element can be effectively factorised 
into irreducible elements. 


Proof. To instantiate the algorithm, we need a closedness strategy. According 
to Proposition 10 it is sufficient for the PID to be solvable, which is shown by 
Proposition 26. Proposition 24 provides a progress measure, and we know from 
Proposition 25 that the Hankel matrix satisfies the ascending chain condition, 
so by Theorem 14 the algorithm terminates. 


The example run given in Section 2.1 is the same when performed over the 
integers. We note that if the teacher holds an automaton model of the correct 
language, equivalence queries are decidable by lifting the embedding of the PID 
into its quotient field to the level of WFAs and checking equivalence there. 


7 Discussion 


We have introduced a general algorithm for learning WFAs over arbitrary semir- 
ings, together with sufficient conditions for termination. We have shown an inher- 
ent termination issue over the natural numbers and proved termination for PIDs. 
Our work extends the results by Bergadano and Varricchio [7], who showed that 
WFAs over fields could be learned from a teacher. Although we note that a PID 
can be embedded into its corresponding field of fractions, the WFAs produced 
when learning over the field potentially have weights outside the PID. 


7 Note that factorisations exist as each principal ideal domain is also a unique factorisation
domain, cf. e.g. [17, Thm. 2.23].

Algorithmic issues with WFAs over arbitrary semirings have been identified
before. For instance, Krob [18] showed that language equivalence is undecidable
for WFAs over the tropical semiring.

On the technical level, a variation on WFAs is given by probabilistic au- 
tomata, where transitions point to convex rather than linear combinations of 
states. One easily adapts the example from Section 5 to show that learning 
probabilistic automata has a similar termination issue. On the positive side, 
Tappler et al. [26] have shown that deterministic MDPs can be learned using an 
L*-based algorithm. The deterministic MDPs in loc. cit. are very different from 
the automata in our paper, as their states generate observable output that allows 
one to identify the current state based on the generated input-output sequence. 

One drawback of the ascending chain condition on the Hankel matrix is 
that this does not give any indication of the number of steps the algorithm 
requires. Indeed, the submodule chains traversed, although converging, may be 
arbitrarily long. We would like to measure and bound the progress made when 
fixing closedness defects, but this turns out to be challenging for PIDs. The rank 
of the module generated by the table may not increase. We leave an investigation 
of alternative measures to future work. 

We would also like to adapt the algorithm so that for PIDs it always produces 
minimal automata. At the moment this is already the case for fields,⁸ 
since adding a row due to a closedness defect preserves linear independence of 
the image of row. For PIDs things are more complicated—adding rows towards 
closedness may break linear independence and thus a basis needs to be found in 
row*. This complicates the construction of the hypothesis. 

Our results show that, on the one hand, WFAs can be learned over finite 
semirings and arbitrary PIDs (assuming computability of the relevant opera- 
tions) and, on the other hand, that there exists an infinite commutative semiring 
for which they cannot be learned. However, there are many classes of semirings 
in between commutative semirings and PIDs, of which we would like to know 
whether their WFAs can be learned by our general algorithm. 

Finally, we would like to generalise our results to extend the framework in- 
troduced in [13], which focusses on learning automata with side-effects over a 
monad. WFAs as considered in the present paper are an instance of those, where 
the monad is the free semimodule monad V(—). At the moment, the results 
in [13] apply to a monad that preserves finite sets, but much of our general 
WFA learning algorithm and termination argument can be extended to that setting. 
It would be interesting to see if crucial properties of PIDs that lead to a 
progress measure and to satisfying the ascending chain condition could also be 
translated to the monad level. 


Acknowledgments. We thank Joshua Moerman for comments and discussions. 


8 There is one exception: the language that assigns 0 to every word, which is accepted 
by a WFA with no states. The algorithm initialises the set of row labels, which 
constitute the state space of the hypothesis, with the empty word. 
Abstract. Vector addition systems are an important model in theoretical 
computer science and have been used in a variety of areas. In this paper, we 
consider vector addition systems with states over a parameterized initial 
configuration. For these systems, we are interested in the standard notion of 
computational time complexity, i.e., we want to understand the length of the 
longest trace for a fixed vector addition system with states depending on the 
size of the initial configuration. We show that the asymptotic complexity of a 
given vector addition system with states is either $\Theta(N^k)$ for some 
computable integer $k$, where $N$ is the size of the initial configuration, or 
at least exponential. We further show that $k$ can be computed in polynomial 
time in the size of the considered vector addition system. Finally, we show 
that $1 \le k \le 2^n$, where $n$ is the dimension of the considered vector 
addition system. 


1 Introduction 


Vector addition systems (VASs) [13], which are equivalent to Petri nets, are a 
popular model for the analysis of parallel processes [7]. Vector addition systems 
with states (VASSs) [10] are an extension of VASs with a finite control and are a 
popular model for the analysis of concurrent systems, because the finite control 
can for example be used to model shared global memory [12]. In this paper, we 
consider VASSs over a parameterized initial configuration. For these systems, 
we are interested in the standard notion of computational time complexity, i.e., 
we want to understand the length of the longest execution for a fixed VASS 
depending on the size of the initial configuration. VASSs over a parameterized 
initial configuration naturally arise in two areas: 1) The parameterized verifica- 
tion problem. For concurrent systems the number of system processes is often 
not known in advance, and thus the system is designed such that a template 
process can be instantiated an arbitrary number of times. The problem of ana- 
lyzing the concurrent system for all possible system sizes is a common theme in 
the literature [9,8,1,11,4,2,3]. 2) Automated complexity analysis of programs. 
VASSs (and generalizations) have been used as backend in program analysis 
tools for automated complexity analysis [18-20]. The VASS considered by these 
tools are naturally parameterized over the initial configuration, modelling the 
dependency of the program complexity on the program input. The cited papers 
have proposed practical techniques but did not give complete algorithms. 


© The Author(s) 2020 
J. Goubault-Larrecq and B. König (Eds.): FOSSACS 2020, LNCS 12077, pp. 622-641, 2020. 
Two recent papers have considered the computational time complexity of 
VASSs over a parameterized initial configuration. [15] presents a PTIME 
procedure for deciding whether a VASS is polynomial or at least exponential, but 
does not give a precise analysis in case of polynomial complexity. [5] establishes 
the precise asymptotic complexity for the special case of VASSs whose 
configurations are linearly bounded in the size of the initial configuration. In 
this paper, we generalize both results and fully characterize the asymptotic 
behaviour of VASSs with polynomial complexity: We show that the asymptotic 
complexity of a given VASS is either $\Theta(N^k)$ for some computable integer 
$k$, where $N$ is the size of the initial configuration, or at least exponential. 
We further show that $k$ can be computed in PTIME in the size of the considered 
VASS. Finally, we show that $1 \le k \le 2^n$, where $n$ is the dimension of the 
considered VASS. 


1.1 Overview and Illustration of Results 


We discuss our approach on the VASS $\mathcal{V}_{run}$, stated in Figure 1, which 
will serve as running example. The VASS has dimension 3 (i.e., the vectors 
annotating the transitions have dimension 3) and four states $s_1, s_2, s_3, s_4$. 
In this paper we will always represent vectors using a set of variables 
$\mathit{Var}$, whose cardinality equals the dimension of the VASS. For 
$\mathcal{V}_{run}$ we choose $\mathit{Var} = \{x, y, z\}$ and use $x, y, z$ 
as indices for the first, second and third component of 3-dimensional vectors. 
The configurations of a VASS are pairs of states and valuations of the variables 
to non-negative integers. A step of a VASS moves along a transition from the 
current state to a successor state, and adds the vector labelling the transition 
to the current valuation; a step can only be taken if the resulting valuation 
is non-negative. For the computational time complexity analysis of VASSs, we 
consider traces (sequences of steps) whose initial configurations consist of a val- 
uation whose maximal value is bounded by N (the parameter used for bounding 
the size of the initial configuration). The computational time complexity is then 
the length of the longest trace whose initial configuration is bounded by N. For 
ease of exposition, we will in this paper only consider VASSs whose control-flow 
graph is connected. (For the general case, we remark that one needs to decom- 
pose a VASS into its strongly-connected components (SCCs), which can then be 
analyzed in isolation, following the DAG-order of the SCC decomposition; for 
this, one needs to slightly generalize the analysis in this paper to initial 
configurations with values $\Theta(N^{k_x})$ for every variable $x \in \mathit{Var}$, where $k_x \in \mathbb{Z}$.) For ease 
of exposition, we further consider traces over arbitrary initial states (instead of 
some fixed initial state); this is justified because for a fixed initial state one can 
always restrict the control-flow graph to the reachable states, and then the two 
options result in the same notion of computational complexity (up to a constant 
offset, which is not relevant for our asymptotic analysis). 

In order to analyze the computational time complexity of a considered VASS, 
our approach computes variable bounds and transition bounds. A variable bound 
is the maximal value of a variable reachable by any trace whose initial configu- 
ration is bounded by N. A transition bound is the maximal number of times a 
transition appears in any trace whose initial configuration is bounded by N. For 
$\mathcal{V}_{run}$, our approach establishes the linear variable bound $\Theta(N)$ 
for $x$ and $y$, and the quadratic bound $\Theta(N^2)$ for $z$. We note that because 
the variable bound of $z$ is quadratic and not linear, $\mathcal{V}_{run}$ cannot be 
analyzed by the procedure of [5]. Our approach establishes the bound $\Theta(N)$ for 
the transitions $s_1 \to s_3$ and $s_4 \to s_2$, the bound $\Theta(N^2)$ for the 
transitions $s_1 \to s_2$, $s_2 \to s_1$, $s_3 \to s_4$, $s_4 \to s_3$, and the bound 
$\Theta(N^3)$ for all self-loops. The computational complexity of $\mathcal{V}_{run}$ 
is then the maximum of all transition bounds, i.e., $\Theta(N^3)$. In general, our 
main algorithm (Algorithm 1 presented in Section 4) either establishes that the VASS 
under analysis has at least exponential complexity or computes asymptotically 
precise variable and transition bounds $\Theta(N^k)$, with $k$ computable in PTIME 
and $1 \le k \le 2^n$, where $n$ is the dimension of the considered VASS. We note 
that our upper bound $2^n$ also improves the analysis of [15], which reports an 
exponential dependence on the number of transitions (and not only on the dimension). 

We further state a family $\mathcal{V}_n$ of VASSs, which illustrates that $k$ can 
indeed be exponential in the dimension (the example can be skipped on first reading). 
$\mathcal{V}_n$ uses variables $x_{i,j}$ and consists of states $s_{i,j}$, for 
$1 \le i \le n$ and $j = 1, 2$. We note that $\mathcal{V}_n$ has dimension $2n$. 
$\mathcal{V}_n$ consists of the transitions 

– $s_{i,1} \xrightarrow{d} s_{i,2}$, for $1 \le i \le n$, with $d(x_{i,1}) = -1$ and 
  $d(x) = 0$ for all $x \ne x_{i,1}$, 
– $s_{i,2} \xrightarrow{d} s_{i,1}$, for $1 \le i \le n$, with $d(x) = 0$ for all $x$, 
– $s_{i,1} \xrightarrow{d} s_{i,1}$, for $1 \le i \le n$, with $d(x_{i,1}) = -1$, 
  $d(x_{i,2}) = 1$, $d(x_{i+1,1}) = d(x_{i+1,2}) = 1$ in case $i < n$, and 
  $d(x) = 0$ for all other $x$, 
– $s_{i,2} \xrightarrow{d} s_{i,2}$, for $1 \le i \le n$, with $d(x_{i,1}) = 1$, 
  $d(x_{i,2}) = -1$, and $d(x) = 0$ for all other $x$, 
– $s_{i,1} \xrightarrow{d} s_{i+1,1}$, for $1 \le i < n$, with $d(x_{i,1}) = -1$ and 
  $d(x) = 0$ for all $x \ne x_{i,1}$, 
– $s_{i+1,2} \xrightarrow{d} s_{i,2}$, for $1 \le i < n$, with $d(x) = 0$ for all $x$. 

$\mathcal{V}_{exp}$ in Figure 1 depicts $\mathcal{V}_n$ for $n = 3$, where the vector 
components are stated in the order $x_{1,1}, x_{1,2}, x_{2,1}, x_{2,2}, x_{3,1}, x_{3,2}$. 
It is not hard to verify for all $1 \le i \le n$ that $\Theta(N^{2^{i-1}})$ is the 
precise asymptotic variable bound for $x_{i,1}$ and $x_{i,2}$, that 
$\Theta(N^{2^{i-1}})$ is the precise asymptotic transition bound for 
$s_{i,1} \to s_{i,2}$, $s_{i,2} \to s_{i,1}$ and $s_{i,1} \to s_{i+1,1}$, 
$s_{i+1,2} \to s_{i,2}$ in case $i < n$, and that $\Theta(N^{2^i})$ is the precise 
asymptotic transition bound for $s_{i,1} \to s_{i,1}$, $s_{i,2} \to s_{i,2}$ 
(Algorithm 1 can be used to find these bounds). 
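
The transition list of the family can be generated mechanically. The following Python sketch is only an illustration: the encoding of states as tuples `("s", i, j)`, of variables as pairs `(i, j)`, and the function name `build_vn` are assumptions of this sketch, and the index ranges of the last two transition kinds follow the reading $1 \le i < n$.

```python
def build_vn(n):
    """Return (variables, transitions) for the family V_n (illustrative encoding)."""
    # Variables x_{i,j} in the order x_{1,1}, x_{1,2}, ..., x_{n,1}, x_{n,2}.
    variables = [(i, j) for i in range(1, n + 1) for j in (1, 2)]

    def upd(changes):
        # Dense update vector; every variable not mentioned gets update 0.
        return tuple(changes.get(var, 0) for var in variables)

    transitions = []
    for i in range(1, n + 1):
        s1, s2 = ("s", i, 1), ("s", i, 2)
        # s_{i,1} -> s_{i,2} consumes one unit of x_{i,1}.
        transitions.append((s1, upd({(i, 1): -1}), s2))
        # s_{i,2} -> s_{i,1} has the zero update.
        transitions.append((s2, upd({}), s1))
        # Self-loop at s_{i,1}: moves x_{i,1} to x_{i,2} and feeds level i+1.
        loop1 = {(i, 1): -1, (i, 2): 1}
        if i < n:
            loop1[(i + 1, 1)] = 1
            loop1[(i + 1, 2)] = 1
        transitions.append((s1, upd(loop1), s1))
        # Self-loop at s_{i,2}: moves x_{i,2} back to x_{i,1}.
        transitions.append((s2, upd({(i, 1): 1, (i, 2): -1}), s2))
        if i < n:
            # Links between level i and level i+1.
            transitions.append((s1, upd({(i, 1): -1}), ("s", i + 1, 1)))
            transitions.append((("s", i + 1, 2), upd({}), s2))
    return variables, transitions
```

For $n = 3$ this yields six variables and sixteen transitions, which can be compared against the right-hand VASS of Figure 1.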


1.2 Related Work 


A celebrated result on VASs is the EXPSPACE-completeness [16,17] of the 
boundedness problem. Deciding termination for a VAS with a fixed initial 
configuration can be reduced to the boundedness problem, and is therefore also 
EXPSPACE-complete; this also applies to VASSs, whose termination problem 
can be reduced to the VAS termination problem. In contrast, deciding the termi- 
nation of VASSs for all initial configurations is in PTIME. It is not hard to see 
that non-termination over all initial configurations is equivalent to the existence 
of non-negative cycles (e.g., using Dickson’s Lemma [6]). Kosaraju and Sullivan 
have given a PTIME procedure for the detection of zero-cycles [14], which can 
easily be adapted to non-negative cycles. The existence of zero-cycles is decided 
Fig. 1. VASS $\mathcal{V}_{run}$ (left) and VASS $\mathcal{V}_{exp}$ (right) 


by the repeated use of a constraint system in order to remove transitions that 
can definitely not be part of a zero-cycle. The algorithm of Kosaraju and Sullivan 
forms the basis for both cited papers [15,5], as well as the present paper. 

A line of work [18-20] has used VASSs (and their generalizations) as backends 
for the automated complexity analysis of C programs. These algorithms have 
been designed for practical applicability, but are not complete and no theoretical 
analysis of their precision has been given. We point out, however, that these 
papers have inspired the Bound Proof Principle in Section 5. 


2 Preliminaries 


Basic Notation. For a set $X$ we denote by $|X|$ the number of elements of $X$. 
Let $S$ be either $\mathbb{N}$ or $\mathbb{Z}$. We write $S^I$ for the set of vectors 
over $S$ indexed by some set $I$. We write $S^{I \times J}$ for the set of matrices 
over $S$ indexed by $I$ and $J$. We write $\mathbf{1}$ for the vector which has entry 
1 in every component. Given $a \in S^I$, we write $a(i) \in S$ for the entry at line 
$i \in I$ of $a$, and $\|a\| = \max_{i \in I} |a(i)|$ for the maximum absolute value 
of $a$. Given $a \in S^I$ and $J \subseteq I$, we denote by $a|_J \in S^J$ the 
restriction of $a$ to $J$, i.e., we set $a|_J(i) = a(i)$ for all $i \in J$. Given 
$A \in S^{I \times J}$, we write $A(j)$ for the vector in column $j \in J$ of $A$ and 
$A(i,j) \in S$ for the entry in row $i \in I$ and column $j \in J$ of $A$. Given 
$A \in S^{I \times J}$ and $K \subseteq J$, we denote by $A|_K \in S^{I \times K}$ the 
restriction of $A$ to $K$, i.e., we set $A|_K(i,j) = A(i,j)$ for all 
$(i,j) \in I \times K$. We write $\mathrm{Id}$ for the square matrix which has 
entries 1 on the diagonal and 0 otherwise. Given $a, b \in S^I$ we write 
$a + b \in S^I$ for component-wise addition, $c \cdot a \in S^I$ for multiplying 
every component of $a$ by some $c \in S$, and $a \ge b$ for component-wise 
comparison. Given $A \in S^{I \times J}$, $B \in S^{J \times K}$ and $x \in S^J$, we 
write $AB \in S^{I \times K}$ for the standard matrix multiplication, $Ax \in S^I$ 
for the standard matrix-vector multiplication, $A^T \in S^{J \times I}$ for the 
transposed matrix of $A$ and $x^T \in S^{1 \times J}$ for the transposed vector of $x$. 


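Vector Addition System with States (VASS). Let $\mathit{Var}$ be a finite set of 
variables. A vector addition system with states (VASS) 
$\mathcal{V} = (\mathit{St}(\mathcal{V}), \mathit{Trns}(\mathcal{V}))$ consists of a 
finite set of states $\mathit{St}(\mathcal{V})$ and a finite set of transitions 
$\mathit{Trns}(\mathcal{V})$, where 
$\mathit{Trns}(\mathcal{V}) \subseteq \mathit{St}(\mathcal{V}) \times \mathbb{Z}^{\mathit{Var}} \times \mathit{St}(\mathcal{V})$; 
we call $n = |\mathit{Var}|$ the dimension of $\mathcal{V}$. We write 
$s_1 \xrightarrow{d} s_2$ to denote a transition 
$(s_1, d, s_2) \in \mathit{Trns}(\mathcal{V})$; we call the vector $d$ the update of 
transition $s_1 \xrightarrow{d} s_2$. A path $\pi$ of $\mathcal{V}$ is a finite 
sequence $s_0 \xrightarrow{d_0} s_1 \xrightarrow{d_1} \cdots s_k$ with 
$s_i \xrightarrow{d_i} s_{i+1} \in \mathit{Trns}(\mathcal{V})$ for all $0 \le i < k$. 
We define the length of $\pi$ by $\mathit{length}(\pi) = k$ and the value of $\pi$ by 
$\mathit{val}(\pi) = \sum_{0 \le i < k} d_i$. Let $\mathit{instance}(\pi, t)$ be the 
number of times $\pi$ contains the transition $t$, i.e., the number of indices $i$ 
such that $t = s_i \xrightarrow{d_i} s_{i+1}$. We remark that 
$\mathit{length}(\pi) = \sum_{t \in \mathit{Trns}(\mathcal{V})} \mathit{instance}(\pi, t)$ 
for every path $\pi$ of $\mathcal{V}$. Given a finite path $\pi_1$ and a path $\pi_2$ 
such that the last state of $\pi_1$ equals the first state of $\pi_2$, we write 
$\pi = \pi_1 \pi_2$ for the path obtained by joining the last state of $\pi_1$ with 
the first state of $\pi_2$; we call $\pi$ the concatenation of $\pi_1$ and $\pi_2$, 
and $\pi_1 \pi_2$ a decomposition of $\pi$. We say $\pi'$ is a sub-path of $\pi$, if 
there is a decomposition $\pi = \pi_1 \pi' \pi_2$ for some $\pi_1, \pi_2$. A cycle is 
a path that has the same start- and end-state. A multi-cycle is a finite set of 
cycles. The value $\mathit{val}(M)$ of a multi-cycle $M$ is the sum of the values of 
its cycles. $\mathcal{V}$ is connected, if for all 
$s, s' \in \mathit{St}(\mathcal{V})$ there is a path from $s$ to $s'$. VASS 
$\mathcal{V}'$ is a sub-VASS of $\mathcal{V}$, if 
$\mathit{St}(\mathcal{V}') \subseteq \mathit{St}(\mathcal{V})$ and 
$\mathit{Trns}(\mathcal{V}') \subseteq \mathit{Trns}(\mathcal{V})$. Sub-VASSs 
$\mathcal{V}_1$ and $\mathcal{V}_2$ are disjoint, if 
$\mathit{St}(\mathcal{V}_1) \cap \mathit{St}(\mathcal{V}_2) = \emptyset$. A 
strongly-connected component (SCC) of a VASS $\mathcal{V}$ is a maximal sub-VASS 
$S$ of $\mathcal{V}$ such that $S$ is connected and $\mathit{Trns}(S) \ne \emptyset$. 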

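Let $\mathcal{V}$ be a VASS. The set of valuations 
$\mathit{Val}(\mathcal{V}) = \mathbb{N}^{\mathit{Var}}$ consists of 
$\mathit{Var}$-vectors over the natural numbers (we assume $\mathbb{N}$ includes 0). 
The set of configurations 
$\mathit{Cfg}(\mathcal{V}) = \mathit{St}(\mathcal{V}) \times \mathit{Val}(\mathcal{V})$ 
consists of pairs of states and valuations. A step is a triple 
$((s_1, \nu_1), d, (s_2, \nu_2)) \in \mathit{Cfg}(\mathcal{V}) \times \mathbb{Z}^{\mathit{Var}} \times \mathit{Cfg}(\mathcal{V})$ 
such that $\nu_2 = \nu_1 + d$ and 
$s_1 \xrightarrow{d} s_2 \in \mathit{Trns}(\mathcal{V})$. We write 
$(s_1, \nu_1) \xrightarrow{d} (s_2, \nu_2)$ to denote a step 
$((s_1, \nu_1), d, (s_2, \nu_2))$ of $\mathcal{V}$. A trace of $\mathcal{V}$ is a 
finite sequence 
$\zeta = (s_0, \nu_0) \xrightarrow{d_0} (s_1, \nu_1) \xrightarrow{d_1} \cdots (s_k, \nu_k)$ 
of steps. We lift the notions of length and instances from paths to traces in the 
obvious way: we consider the path 
$\pi = s_0 \xrightarrow{d_0} s_1 \xrightarrow{d_1} \cdots s_k$ that consists of the 
transitions used by $\zeta$, and set $\mathit{length}(\zeta) := \mathit{length}(\pi)$ 
and $\mathit{instance}(\zeta, t) := \mathit{instance}(\pi, t)$, for all 
$t \in \mathit{Trns}(\mathcal{V})$. We denote by $\mathit{init}(\zeta) = \|\nu_0\|$ 
the maximum absolute value of the starting valuation $\nu_0$ of $\zeta$. We say that 
$\zeta$ reaches a valuation $\nu$, if $\nu = \nu_k$. The complexity of $\mathcal{V}$ is 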


The Polynomial Complexity of VASS 627 


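the function 
$\mathit{comp}_{\mathcal{V}}(N) = \sup_{\text{trace } \zeta \text{ of } \mathcal{V},\ \mathit{init}(\zeta) \le N} \mathit{length}(\zeta)$, 
which returns for every $N \ge 0$ the supremum over the lengths of the traces 
$\zeta$ with $\mathit{init}(\zeta) \le N$. The variable bound of a variable 
$x \in \mathit{Var}$ is the function 
$\mathit{vbound}_x(N) = \sup_{\text{trace } \zeta \text{ of } \mathcal{V},\ \mathit{init}(\zeta) \le N,\ \zeta \text{ reaches valuation } \nu} \nu(x)$, 
which returns for every $N \ge 0$ the supremum over the values of $x$ reachable by 
traces $\zeta$ with $\mathit{init}(\zeta) \le N$. The transition bound of a 
transition $t \in \mathit{Trns}(\mathcal{V})$ is the function 
$\mathit{tbound}_t(N) = \sup_{\text{trace } \zeta \text{ of } \mathcal{V},\ \mathit{init}(\zeta) \le N} \mathit{instance}(\zeta, t)$, 
which returns for every $N \ge 0$ the supremum over the number of instances of $t$ 
in traces $\zeta$ with $\mathit{init}(\zeta) \le N$. 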
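
For very small instances the complexity function can be evaluated by exhaustive search, which is a convenient way to check one's reading of the definitions. The sketch below is not part of the paper's method: it assumes a hypothetical encoding of transitions as triples `(source, update, target)` and only terminates for VASSs without infinite traces. As input it uses the member of the family $\mathcal{V}_n$ from Section 1.1 with $n = 1$.

```python
from functools import lru_cache
from itertools import product

# V_1 from Section 1.1: variables (x_{1,1}, x_{1,2}), states s11 and s12.
V1 = [
    ("s11", (-1, 0), "s12"),  # s_{1,1} -> s_{1,2}, consumes x_{1,1}
    ("s12", (0, 0), "s11"),   # s_{1,2} -> s_{1,1}, zero update
    ("s11", (-1, 1), "s11"),  # self-loop moving x_{1,1} to x_{1,2}
    ("s12", (1, -1), "s12"),  # self-loop moving x_{1,2} to x_{1,1}
]


def comp(transitions, n):
    """Length of the longest trace whose initial valuation is bounded by n.

    Brute force over all initial configurations; assumes every trace is finite.
    """
    states = sorted({s for s, _, _ in transitions} | {t for _, _, t in transitions})
    dim = len(transitions[0][1])

    @lru_cache(maxsize=None)
    def longest(state, valuation):
        best = 0
        for src, d, tgt in transitions:
            if src != state:
                continue
            nxt = tuple(v + u for v, u in zip(valuation, d))
            # A step is only allowed if the resulting valuation is non-negative.
            if min(nxt) >= 0:
                best = max(best, 1 + longest(tgt, nxt))
        return best

    # Traces may start in any state, as in the convention of Section 1.1.
    return max(
        longest(s, v)
        for s in states
        for v in product(range(n + 1), repeat=dim)
    )
```

Calling `comp(V1, N)` for increasing `N` gives concrete values that can be compared against the asymptotic bounds discussed in Section 1.1; the search is exponential in the dimension and is meant for experimentation only.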


Rooted Tree. A rooted tree is a connected undirected acyclic graph in which one 
node has been designated as the root. We will usually denote the root by $\iota$. We 
note that for every node $\eta$ in a rooted tree there is a unique path from $\eta$ 
to the root. The parent of a node $\eta \ne \iota$ is the node connected to $\eta$ on 
the path to the root. Node $\eta$ is a child of a node $\eta'$, if $\eta'$ is the 
parent of $\eta$. $\eta'$ is a descendent of $\eta$, if $\eta$ lies on the path from 
$\eta'$ to the root; $\eta'$ is a strict descendent, if furthermore 
$\eta \ne \eta'$. $\eta$ is an ancestor of $\eta'$, if $\eta'$ is a descendent of 
$\eta$; $\eta$ is a strict ancestor, if furthermore $\eta \ne \eta'$. The distance of 
a node $\eta$ to the root is the number of nodes $\ne \eta$ on the path from $\eta$ 
to the root. We denote by $\mathit{layer}(l)$ the set of all nodes with the same 
distance $l$ to the root; we remark that $\mathit{layer}(0) = \{\iota\}$. 


All proofs are presented in the extended version [21] for space reasons. 


3 A Dichotomy Result 


We will make use of the following matrices associated to a VASS throughout 
the paper: Let $\mathcal{V}$ be a VASS. We define the update matrix 
$D \in \mathbb{Z}^{\mathit{Var} \times \mathit{Trns}(\mathcal{V})}$ by setting 
$D(t) = d$ for all transitions $t = (s, d, s') \in \mathit{Trns}(\mathcal{V})$. We 
define the flow matrix 
$F \in \mathbb{Z}^{\mathit{St}(\mathcal{V}) \times \mathit{Trns}(\mathcal{V})}$ by 
setting $F(s, t) = -1$, $F(s', t) = 1$ for transitions $t = (s, d, s')$ with 
$s' \ne s$, and $F(s, t) = F(s', t) = 0$ for transitions $t = (s, d, s')$ with 
$s' = s$; in both cases we further set $F(s'', t) = 0$ for all states $s''$ with 
$s'' \ne s$ and $s'' \ne s'$. We note that every column $t$ of $F$ either contains 
exactly one $-1$ and one $1$ entry (in case the source and target of transition $t$ 
are different) or only 0 entries (in case the source and target of transition $t$ 
are the same). 


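Example 1. We state the update and flow matrix for $\mathcal{V}_{run}$ from Section 1: 

$$D = \begin{pmatrix}
-1 & 1 & -1 & 1 & 0 & 0 & 0 & 0 & -1 & 0 \\
1 & -1 & 1 & -1 & 0 & 0 & 0 & 0 & 0 & 0 \\
-1 & 1 & 1 & -1 & -1 & -1 & -1 & -1 & 0 & 0
\end{pmatrix}, \qquad
F = \begin{pmatrix}
0 & 0 & 0 & 0 & 1 & -1 & 0 & 0 & -1 & 0 \\
0 & 0 & 0 & 0 & -1 & 1 & 0 & 0 & 0 & 1 \\
0 & 0 & 0 & 0 & 0 & 0 & 1 & -1 & 1 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & -1 & 1 & 0 & -1
\end{pmatrix}$$

with column order $s_1 \to s_1$, $s_2 \to s_2$, $s_3 \to s_3$, $s_4 \to s_4$, 
$s_2 \to s_1$, $s_1 \to s_2$, $s_4 \to s_3$, $s_3 \to s_4$, $s_1 \to s_3$, 
$s_4 \to s_2$ (from left to right) and row order $x, y, z$ for $D$ resp. 
$s_1, s_2, s_3, s_4$ for $F$ (from top to bottom). 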
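
Both matrices can be derived mechanically from a transition list. The following Python sketch illustrates the two definitions; the triple encoding `(source, update, target)` and the function names are assumptions of this sketch, and the update vectors are read off the matrix $D$ of Example 1 rather than taken from Figure 1.

```python
STATES = ["s1", "s2", "s3", "s4"]
VARS = ["x", "y", "z"]

# Transitions of the running example in the column order of Example 1.
# Each entry is (source, update over (x, y, z), target).
TRANSITIONS = [
    ("s1", (-1, 1, -1), "s1"),
    ("s2", (1, -1, 1), "s2"),
    ("s3", (-1, 1, 1), "s3"),
    ("s4", (1, -1, -1), "s4"),
    ("s2", (0, 0, -1), "s1"),
    ("s1", (0, 0, -1), "s2"),
    ("s4", (0, 0, -1), "s3"),
    ("s3", (0, 0, -1), "s4"),
    ("s1", (-1, 0, 0), "s3"),
    ("s4", (0, 0, 0), "s2"),
]


def update_matrix(variables, transitions):
    """Row per variable, column per transition: D(x, t) is the update of x by t."""
    return [[d[i] for (_, d, _) in transitions] for i in range(len(variables))]


def flow_matrix(states, transitions):
    """Row per state, column per transition: -1 at the source, +1 at the target."""
    rows = []
    for s in states:
        row = []
        for src, _, tgt in transitions:
            if src == tgt:
                row.append(0)  # self-loops contribute an all-zero column
            else:
                row.append((1 if s == tgt else 0) - (1 if s == src else 0))
        rows.append(row)
    return rows
```

Every column of the resulting flow matrix sums to zero, which reflects the remark that a column contains either one $-1$ and one $1$ entry or only zeros.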


We now consider the constraint systems (P) and (Q), stated below, which 
have maximization objectives. The constraint systems will be used by our main 
algorithm in Section 4. We observe that both constraint systems are always 
satisfiable (set all coefficients to zero) and that the solutions of both 
constraint systems are closed under addition. Hence, the number of inequalities 
for which the maximization objective is satisfied is unique for optimal solutions 
of both constraint systems. The maximization objectives can be implemented by 
suitable linear objective functions. Hence, both constraint systems can be solved 
in PTIME over the integers, because we can use linear programming over the 
rationals and then scale rational solutions to the integers by multiplying with 
the least common multiple of the denominators. 
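
The scaling step is elementary. As a minimal sketch (the function name `scale_to_integers` is an assumption of this illustration, and `math.lcm` requires Python 3.9 or later), a rational solution can be turned into an integer one as follows. Because (P) and (Q) consist of homogeneous constraints, multiplying a solution by a positive integer preserves feasibility and the set of strictly satisfied inequalities.

```python
from fractions import Fraction
from math import lcm


def scale_to_integers(solution):
    """Multiply a rational vector by the lcm of its denominators."""
    fractions = [Fraction(v) for v in solution]
    # lcm() of no arguments is 1, so the empty vector is handled as well.
    factor = lcm(*(f.denominator for f in fractions))
    return [int(f * factor) for f in fractions]
```

For example, the vector with entries 1/2, 2/3 and 0 is scaled by the factor 6.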


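constraint system (P): 
there exists $\mu \in \mathbb{Z}^{\mathit{Trns}(\mathcal{V})}$ with 
    $D\mu \ge 0$ 
    $\mu \ge 0$ 
    $F\mu = 0$ 
Maximization Objective: 
Maximize the number of inequalities with $(D\mu)(x) > 0$ and $\mu(t) > 0$ 

constraint system (Q): 
there exist $r \in \mathbb{Z}^{\mathit{Var}}$, $z \in \mathbb{Z}^{\mathit{St}(\mathcal{V})}$ with 
    $r \ge 0$ 
    $z \ge 0$ 
    $D^T r + F^T z \le 0$ 
Maximization Objective: 
Maximize the number of inequalities with $r(x) > 0$ and $(D^T r + F^T z)(t) < 0$ 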


The solutions of (P) and (Q) are characterized by the following two lemmata: 


Lemma 2 (Cited from [14]). $\mu \in \mathbb{Z}^{\mathit{Trns}(\mathcal{V})}$ is a 
solution to constraint system (P) iff there exists a multi-cycle $M$ with 
$\mathit{val}(M) \ge 0$ and $\mu(t)$ instances of transition $t$ for every 
$t \in \mathit{Trns}(\mathcal{V})$. 


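Lemma 3 (Cited from [5]¹). Let $r, z$ be a solution to constraint system (Q). 
Let $\mathit{rank}(r, z) : \mathit{Cfg}(\mathcal{V}) \to \mathbb{N}$ be the function 
defined by $\mathit{rank}(r, z)(s, \nu) = r^T \nu + z(s)$. Then, 
$\mathit{rank}(r, z)$ is a quasi-ranking function for $\mathcal{V}$, i.e., we have 


1. for all $(s, \nu) \in \mathit{Cfg}(\mathcal{V})$ that 
$\mathit{rank}(r, z)(s, \nu) \ge 0$; 


2. for all transitions $t = s_1 \xrightarrow{d} s_2 \in \mathit{Trns}(\mathcal{V})$ 
and valuations $\nu_1, \nu_2 \in \mathit{Val}(\mathcal{V})$ with 
$\nu_2 = \nu_1 + d$ that 
$\mathit{rank}(r, z)(s_1, \nu_1) \ge \mathit{rank}(r, z)(s_2, \nu_2)$; moreover, the 
inequality is strict for every $t$ with $(D^T r + F^T z)(t) < 0$. 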


We now state a dichotomy between optimal solutions to constraint sys- 
tems (P) and (Q), which is obtained by an application of Farkas’ Lemma. This 
dichotomy is the main reason why we are able to compute the precise asymptotic 
complexity of VASSs with polynomial bounds. 


1 There is no explicit lemma with this statement in [5], however the lemma is implicit 
in the exposition of Section 4 in [5]. We further note that [5] does not include the 
constraint $z \ge 0$. However, this difference is minor; the constraint was added in order to ensure 
that ranking functions always return non-negative values, which is more standard 
than the choice of [5]. A proof of the lemma can be found in the extended version [21]. 




Lemma 4. Let $r$ and $z$ be an optimal solution to constraint system (Q) and let 
$\mu$ be an optimal solution to constraint system (P). Then, for all variables 
$x \in \mathit{Var}$ we either have $r(x) > 0$ or $(D\mu)(x) \ge 1$, and for all 
transitions $t \in \mathit{Trns}(\mathcal{V})$ we either have 
$(D^T r + F^T z)(t) < 0$ or $\mu(t) \ge 1$. 


Example 5. Our main algorithm, Algorithm 1 presented in Section 4, will directly 
use constraint systems (P) and (Q) in its first loop iteration, and adjusted 
versions in later loop iterations. Here, we illustrate the first loop iteration. We 
consider the running example $\mathcal{V}_{run}$, whose update and flow matrices we 
have stated in Example 1. An optimal solution to constraint systems (P) and (Q) is 
given by $\mu = (1, 4, 4, 1, 1, 1, 1, 1, 0, 0)^T$ and $r = (2, 2, 0)^T$, 
$z = (0, 0, 1, 1)^T$. The quasi-ranking function $\mathit{rank}(r, z)$ immediately 
establishes that $\mathit{tbound}_t(N) \in O(N)$ for $t = s_1 \to s_3$ and 
$t = s_4 \to s_2$, because 1) $\mathit{rank}(r, z)$ decreases for these two 
transitions and does not increase for other transitions (by Lemma 3), and because 
2) the initial value of $\mathit{rank}(r, z)$ is bounded by $O(N)$, i.e., we have 
$\mathit{rank}(r, z)(s, \nu) \in O(N)$ for every state 
$s \in \mathit{St}(\mathcal{V}_{run})$ and every valuation $\nu$ with 
$\|\nu\| \le N$. By a similar argument we get $\mathit{vbound}_x(N) \in O(N)$ and 
$\mathit{vbound}_y(N) \in O(N)$. The exact reasoning for deriving upper bounds is 
given in Section 5. From $\mu$ we can, by Lemma 2, obtain the cycles 
$C_1 = s_1 \to s_2 \to s_2 \to s_2 \to s_2 \to s_2 \to s_1 \to s_1$ and 
$C_2 = s_3 \to s_3 \to s_3 \to s_3 \to s_3 \to s_4 \to s_4 \to s_3$ with 
$\mathit{val}(C_1) + \mathit{val}(C_2) \ge (0, 0, 1)^T$ (*). We will later show that 
the cycles $C_1$ and $C_2$ give rise to a family of traces that establish 
$\mathit{tbound}_t(N) \in \Omega(N^2)$ for all transitions 
$t \in \mathit{Trns}(\mathcal{V}_{run})$ with $t \ne s_1 \to s_3$ and 
$t \ne s_4 \to s_2$. Here we give an intuition on the construction: We consider a 
cycle $C$ of $\mathcal{V}_{run}$ that visits all states at least once. By (*), the 
updates along the cycles $C_1$ and $C_2$ cancel each other out. However, the two 
cycles are not connected. Hence, we execute the cycle $C_1$ some $\Omega(N)$ times, 
then (a part of) the cycle $C$, then execute $C_2$ as often as $C_1$, and finally the 
remaining part of $C$; this we repeat $\Omega(N)$ times. This construction also 
establishes the bound $\mathit{vbound}_z(N) \in \Omega(N^2)$ because, by (*), we 
increase $z$ with every joint execution of $C_1$ and $C_2$. The precise lower bound 
construction is given in Section 6. 
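
The stated solutions can be checked by direct computation. The following Python sketch verifies feasibility for (P) and (Q) and the either/or property of Lemma 4; it does not search for optimal solutions. The matrices are those of Example 1 as reconstructed there, and all function names are assumptions of this illustration.

```python
# Rows x, y, z; columns in the transition order of Example 1.
D = [
    [-1, 1, -1, 1, 0, 0, 0, 0, -1, 0],
    [1, -1, 1, -1, 0, 0, 0, 0, 0, 0],
    [-1, 1, 1, -1, -1, -1, -1, -1, 0, 0],
]
# Rows s1, s2, s3, s4.
F = [
    [0, 0, 0, 0, 1, -1, 0, 0, -1, 0],
    [0, 0, 0, 0, -1, 1, 0, 0, 0, 1],
    [0, 0, 0, 0, 0, 0, 1, -1, 1, 0],
    [0, 0, 0, 0, 0, 0, -1, 1, 0, -1],
]
MU = [1, 4, 4, 1, 1, 1, 1, 1, 0, 0]
R = [2, 2, 0]
Z = [0, 0, 1, 1]


def mat_vec(a, v):
    return [sum(x * y for x, y in zip(row, v)) for row in a]


def transpose(a):
    return [list(col) for col in zip(*a)]


def solves_p(d, f, mu):
    """D mu >= 0, mu >= 0 and F mu = 0."""
    return (
        all(v >= 0 for v in mat_vec(d, mu))
        and all(m >= 0 for m in mu)
        and all(v == 0 for v in mat_vec(f, mu))
    )


def q_values(d, f, r, z):
    """The vector D^T r + F^T z, one entry per transition."""
    return [a + b for a, b in zip(mat_vec(transpose(d), r), mat_vec(transpose(f), z))]


def solves_q(d, f, r, z):
    """r >= 0, z >= 0 and D^T r + F^T z <= 0."""
    return (
        all(v >= 0 for v in r)
        and all(v >= 0 for v in z)
        and all(v <= 0 for v in q_values(d, f, r, z))
    )


def dichotomy_holds(d, f, mu, r, z):
    """Property of Lemma 4 for the given pair of solutions."""
    d_mu, q = mat_vec(d, mu), q_values(d, f, r, z)
    variables_ok = all(r[x] > 0 or d_mu[x] >= 1 for x in range(len(r)))
    transitions_ok = all(q[t] < 0 or mu[t] >= 1 for t in range(len(mu)))
    return variables_ok and transitions_ok
```

With these inputs, `q_values(D, F, R, Z)` is strictly negative exactly at the last two columns, i.e., for $s_1 \to s_3$ and $s_4 \to s_2$, which are the transitions that receive the linear bound.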


4 Main Algorithm 


Our main algorithm, Algorithm 1, computes the complexity as well as variable 
and transition bounds of an input VASS $\mathcal{V}$, either detecting that 
$\mathcal{V}$ has at least exponential complexity or reporting precise asymptotic 
bounds for the transitions and variables of $\mathcal{V}$ (up to a constant factor): 
Algorithm 1 will compute values $\mathit{vExp}(x) \in \mathbb{N}$ such that 
$\mathit{vbound}_x(N) \in \Theta(N^{\mathit{vExp}(x)})$ for every 
$x \in \mathit{Var}$ and values $\mathit{tExp}(t) \in \mathbb{N}$ such that 
$\mathit{tbound}_t(N) \in \Theta(N^{\mathit{tExp}(t)})$ for every 
$t \in \mathit{Trns}(\mathcal{V})$. 


Data Structures. The algorithm maintains a rooted tree $T$. Every node $\eta$ of 
$T$ will always be labelled by a sub-VASS $\mathit{VASS}(\eta)$ of $\mathcal{V}$. The 
nodes in the same layer of $T$ will always be labelled by disjoint sub-VASSs of 
$\mathcal{V}$. The main loop of Algorithm 1 will extend $T$ by one layer per loop 
iteration. The variable $l$ always contains the next layer that is going to be added 
to $T$. For computing variable and transition bounds, Algorithm 1 maintains the 
functions $\mathit{vExp} : \mathit{Var} \to \mathbb{N} \cup \{\infty\}$ and 
$\mathit{tExp} : \mathit{Trns}(\mathcal{V}) \to \mathbb{N} \cup \{\infty\}$. 
Initialization. We assume $D$ to be the update matrix and $F$ to be the flow 
matrix associated to $\mathcal{V}$ as discussed in Section 3. At initialization, $T$ 
consists of the root node $\iota$ and we set $\mathit{VASS}(\iota) = \mathcal{V}$, 
i.e., the root is labelled by the input $\mathcal{V}$. We initialize $l = 1$ as 
Algorithm 1 is going to add layer 1 to $T$ in the first loop iteration. We 
initialize $\mathit{vExp}(x) = \infty$ for all variables $x \in \mathit{Var}$ and 
$\mathit{tExp}(t) = \infty$ for all transitions $t \in \mathit{Trns}(\mathcal{V})$. 


The constraint systems solved during each loop iteration. In loop iteration $l$, 
Algorithm 1 will set $\mathit{tExp}(t) := l$ for some transitions $t$ and 
$\mathit{vExp}(x) := l$ for some variables $x$. In order to determine those 
transitions and variables, Algorithm 1 instantiates constraint systems (P) and (Q) 
from Section 3 over the set of transitions 
$U = \bigcup_{\eta \in \mathit{layer}(l-1)} \mathit{Trns}(\mathit{VASS}(\eta))$, 
which contains all transitions associated to nodes in layer $l - 1$ of $T$. However, 
instead of a direct instantiation using $D|_U$ and $F|_U$ (i.e., the restriction of 
$D$ and $F$ to the transitions $U$), we need to work with an extended set of 
variables and an extended update matrix. We set 
$\mathit{Var}_{ext} := \{(x, \eta) \mid \eta \in \mathit{layer}(l - \mathit{vExp}(x))\}$, 
where we set $n - \infty = 0$ for all $n \in \mathbb{N}$. This means that we use a 
different copy of variable $x$ for every node $\eta$ in layer 
$l - \mathit{vExp}(x)$. We note that for a variable $x$ with 
$\mathit{vExp}(x) = \infty$ there is only a single copy of $x$ in 
$\mathit{Var}_{ext}$, because $\iota \in \mathit{layer}(0)$ is the only node in layer 
0. We define the extended update matrix 
$D_{ext} \in \mathbb{Z}^{\mathit{Var}_{ext} \times U}$ by setting 

$$D_{ext}((x, \eta), t) = \begin{cases} D(x, t), & \text{if } t \in \mathit{Trns}(\mathit{VASS}(\eta)), \\ 0, & \text{otherwise.} \end{cases}$$

Constraint systems (I) and (II) stated in Figure 2 can be recognized as 
instantiation of constraint systems (P) and (Q) with matrices $D_{ext}$ and $F|_U$ 
and variables $\mathit{Var}_{ext}$, and hence the dichotomy stated in Lemma 4 holds. 
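
The construction of the extended variable set and update matrix can be sketched in a few lines of Python. The data layout below (`layers`, `trns_of`, dictionaries for `vExp` and `D`) is a hypothetical encoding chosen for this illustration. The sample data describes the running example at $l = 2$ and assumes that the first iteration has removed $s_1 \to s_3$ and $s_4 \to s_2$ and has set the exponents of $x$ and $y$ to 1, as the values of Example 5 suggest.

```python
import math


def extended_system(l, layers, trns_of, v_exp, d, variables):
    """Return (U, Var_ext, D_ext) for loop iteration l.

    layers[k] lists the nodes at distance k from the root,
    trns_of[node] is the set of transition ids of VASS(node),
    d[x][t] is the entry D(x, t) of the update matrix.
    """
    u = sorted(set().union(*(trns_of[node] for node in layers[l - 1])))
    var_ext = []
    for x in variables:
        # Convention of the text: n - infinity = 0, i.e. use the root layer.
        k = 0 if v_exp[x] == math.inf else l - v_exp[x]
        var_ext.extend((x, node) for node in layers[k])
    d_ext = {
        ((x, node), t): (d[x][t] if t in trns_of[node] else 0)
        for (x, node) in var_ext
        for t in u
    }
    return u, var_ext, d_ext


# Sample data: transition ids 0..9 follow the column order of Example 1.
D = {
    "x": [-1, 1, -1, 1, 0, 0, 0, 0, -1, 0],
    "y": [1, -1, 1, -1, 0, 0, 0, 0, 0, 0],
    "z": [-1, 1, 1, -1, -1, -1, -1, -1, 0, 0],
}
LAYERS = {0: ["root"], 1: ["A", "B"]}
TRNS_OF = {"root": set(range(10)), "A": {0, 1, 4, 5}, "B": {2, 3, 6, 7}}
V_EXP = {"x": 1, "y": 1, "z": math.inf}
```

In this situation `extended_system(2, LAYERS, TRNS_OF, V_EXP, D, ["x", "y", "z"])` creates two copies each of $x$ and $y$ (one per node of layer 1) and a single copy of $z$ attached to the root.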

We comment on the choice of $\mathit{Var}_{ext}$: Setting 
$\mathit{Var}_{ext} = \{(x, \eta) \mid \eta \in \mathit{layer}(i)\}$ for any 
$i \le l - \mathit{vExp}(x)$ would result in correct upper bounds (while 
$i > l - \mathit{vExp}(x)$ would not). However, choosing 
$i < l - \mathit{vExp}(x)$ does in general result in sub-optimal bounds because 
fewer variables make constraint system (I) easier and constraint system (II) harder 
to satisfy (in terms of their maximization objectives). In fact, 
$i = l - \mathit{vExp}(x)$ is the optimal choice, because this choice allows us to 
prove corresponding lower bounds in Section 6. We will further comment on key 
properties of constraint systems (I) and (II) in Sections 5 and 6, when we outline 
the proofs of the upper resp. lower bound. 

We note that Algorithm 1 does not use the optimal solution $\mu$ to constraint 
system (I) for the computation of $\mathit{vExp}(x)$ and $\mathit{tExp}(t)$, and 
hence the computation of the optimal solution $\mu$ could be removed from the 
algorithm. The solution $\mu$ is however needed for the extraction of lower bounds 
in Sections 6 and 8, and this is the reason why it is stated here. The extraction of 
lower bounds is not explicitly added to the algorithm in order to not clutter the 
presentation. 


Discovering transition bounds. After an optimal solution $r, z$ to constraint 
system (II) has been found, Algorithm 1 collects all transitions $t$ with 
$(D_{ext}^T r + F|_U^T z)(t) < 0$ in the set $R$ (note that the optimization 
criterion in constraint system (II) tries to find as many such $t$ as possible). 
Algorithm 1 then sets $\mathit{tExp}(t) := l$ for all $t \in R$. The transitions in 
$R$ will not be part of layer $l$ of $T$. 
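Input: a connected VASS $\mathcal{V}$ with update matrix $D$ and flow matrix $F$ 
$T$ := single root node $\iota$ with $\mathit{VASS}(\iota) = \mathcal{V}$; 
$l := 1$; 
$\mathit{vExp}(x) := \infty$ for all variables $x \in \mathit{Var}$; 
$\mathit{tExp}(t) := \infty$ for all transitions $t \in \mathit{Trns}(\mathcal{V})$; 
repeat 
    let $U := \bigcup_{\eta \in \mathit{layer}(l-1)} \mathit{Trns}(\mathit{VASS}(\eta))$; 
    let $\mathit{Var}_{ext} := \{(x, \eta) \mid \eta \in \mathit{layer}(l - \mathit{vExp}(x))\}$, where $n - \infty = 0$ for $n \in \mathbb{N}$; 
    let $D_{ext} \in \mathbb{Z}^{\mathit{Var}_{ext} \times U}$ be the matrix defined by 
        $D_{ext}((x, \eta), t) = D(x, t)$ if $t \in \mathit{Trns}(\mathit{VASS}(\eta))$, and $0$ otherwise; 
    find optimal solutions $\mu$ and $r, z$ to constraint systems (I) and (II); 
    let $R := \{t \in U \mid (D_{ext}^T r + F|_U^T z)(t) < 0\}$; 
    set $\mathit{tExp}(t) := l$ for all $t \in R$; 
    foreach $\eta \in \mathit{layer}(l - 1)$ do 
        let $\mathcal{V}' := \mathit{VASS}(\eta)$ be the VASS associated to $\eta$; 
        decompose $(\mathit{St}(\mathcal{V}'), \mathit{Trns}(\mathcal{V}') \setminus R)$ into SCCs; 
        foreach SCC $S$ of $(\mathit{St}(\mathcal{V}'), \mathit{Trns}(\mathcal{V}') \setminus R)$ do 
            create a child $\eta'$ of $\eta$ with $\mathit{VASS}(\eta') = S$; 
    foreach $x \in \mathit{Var}$ with $\mathit{vExp}(x) = \infty$ do 
        if $r(x, \iota) > 0$ then set $\mathit{vExp}(x) := l$; 
    if there are no $x \in \mathit{Var}$, $t \in \mathit{Trns}(\mathcal{V})$ with $l < \mathit{vExp}(x) + \mathit{tExp}(t) < \infty$ then 
        return “$\mathcal{V}$ has at least exponential complexity” 
    $l := l + 1$; 
until $\mathit{vExp}(x) \ne \infty$ and $\mathit{tExp}(t) \ne \infty$ for all $x \in \mathit{Var}$ and $t \in \mathit{Trns}(\mathcal{V})$; 

Algorithm 1: Computes transition and variable bounds for a VASS $\mathcal{V}$ 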


constraint system (I): 
there exists $\mu \in \mathbb{Z}^{U}$ with 
    $D_{ext}\mu \ge 0$ 
    $\mu \ge 0$ 
    $F|_U \mu = 0$ 
Maximization Objective: 
Maximize the number of inequalities with $(D_{ext}\mu)(x, \eta) > 0$ and $\mu(t) > 0$ 

constraint system (II): 
there exist $r \in \mathbb{Z}^{\mathit{Var}_{ext}}$, $z \in \mathbb{Z}^{\mathit{St}(\mathcal{V})}$ with 
    $r \ge 0$ 
    $z \ge 0$ 
    $D_{ext}^T r + F|_U^T z \le 0$ 
Maximization Objective: 
Maximize the number of inequalities with $r(x, \eta) > 0$ and $(D_{ext}^T r + F|_U^T z)(t) < 0$ 


Fig. 2. Constraint systems (I) and (II) used by Algorithm 1 


Construction of the next layer in $T$. For each node $\eta$ in layer $l - 1$, 
Algorithm 1 will create children by removing the transitions in $R$. This is done as 
follows: Given a node $\eta$ in layer $l - 1$, Algorithm 1 considers the VASS 
$\mathcal{V}' = \mathit{VASS}(\eta)$ associated to $\eta$. Then, 
$(\mathit{St}(\mathcal{V}'), \mathit{Trns}(\mathcal{V}') \setminus R)$ is decomposed 
into its SCCs. Finally, for each SCC $S$ of 
$(\mathit{St}(\mathcal{V}'), \mathit{Trns}(\mathcal{V}') \setminus R)$ a child 
$\eta'$ of $\eta$ is created with $\mathit{VASS}(\eta') = S$. Clearly, the new nodes 
in layer $l$ are labelled by disjoint sub-VASSs of $\mathcal{V}$. 
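
The decomposition step can be illustrated with a standard SCC computation. The sketch below uses Kosaraju's two-pass algorithm, which is a choice of this illustration and not prescribed by the paper; following the definition in Section 2, components without any transition are dropped. The function name and the triple encoding of transitions are assumptions, and the update vectors are those read off Example 1.

```python
def sccs_with_transitions(states, transitions, removed):
    """SCCs (with at least one transition) of the graph after deleting `removed`."""
    kept = [t for t in transitions if t not in removed]
    succ = {s: [] for s in states}
    pred = {s: [] for s in states}
    for src, _, tgt in kept:
        succ[src].append(tgt)
        pred[tgt].append(src)

    # First pass: record states in order of completed depth-first search.
    order, seen = [], set()

    def visit(s):
        seen.add(s)
        for nxt in succ[s]:
            if nxt not in seen:
                visit(nxt)
        order.append(s)

    for s in states:
        if s not in seen:
            visit(s)

    # Second pass: explore the reversed graph in reverse finishing order.
    component = {}

    def assign(s, root):
        component[s] = root
        for prv in pred[s]:
            if prv not in component:
                assign(prv, root)

    for s in reversed(order):
        if s not in component:
            assign(s, s)

    # Keep only transitions inside one component; this drops trivial SCCs.
    result = {}
    for src, d, tgt in kept:
        if component[src] == component[tgt]:
            sts, trs = result.setdefault(component[src], (set(), []))
            sts.update((src, tgt))
            trs.append((src, d, tgt))
    return sorted((sorted(sts), trs) for sts, trs in result.values())


STATES = ["s1", "s2", "s3", "s4"]
TRANSITIONS = [
    ("s1", (-1, 1, -1), "s1"), ("s2", (1, -1, 1), "s2"),
    ("s3", (-1, 1, 1), "s3"), ("s4", (1, -1, -1), "s4"),
    ("s2", (0, 0, -1), "s1"), ("s1", (0, 0, -1), "s2"),
    ("s4", (0, 0, -1), "s3"), ("s3", (0, 0, -1), "s4"),
    ("s1", (-1, 0, 0), "s3"), ("s4", (0, 0, 0), "s2"),
]
# R for the first iteration on the running example: s1 -> s3 and s4 -> s2.
REMOVED = {TRANSITIONS[8], TRANSITIONS[9]}
```

On the running example this splits the root VASS into one component over $\{s_1, s_2\}$ and one over $\{s_3, s_4\}$, each with four transitions. The recursive traversal is adequate for small inputs; for large VASSs an iterative variant would avoid Python's recursion limit.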


The transitions of the next layer. The following lemma states that the new layer 
$l$ of $T$ contains all transitions of layer $l - 1$ except for the transitions $R$; 
the lemma is due to the fact that every transition in $U \setminus R$ belongs to a 
cycle and hence to some SCC that is part of the new layer $l$. 


Lemma 6. We consider the new layer constructed during loop iteration $l$ of 
Algorithm 1: we have 
$U \setminus R = \bigcup_{\eta \in \mathit{layer}(l)} \mathit{Trns}(\mathit{VASS}(\eta))$. 

Discovering variable bounds. For each $x \in \mathit{Var}$ with 
$\mathit{vExp}(x) = \infty$, Algorithm 1 checks whether $r(x, \iota) > 0$ (we point 
out that the optimization criterion in constraint system (II) tries to find as many 
such $x$ with $r(x, \iota) > 0$ as possible). Algorithm 1 then sets 
$\mathit{vExp}(x) := l$ for all those variables. 


The check for exponential complexity. In each loop iteration, Algorithm 1 checks
whether there are $x \in \mathit{Var}$, $t \in \mathrm{Trns}(\mathcal{V})$ with $l < \mathrm{vExp}(x) + \mathrm{tExp}(t) < \infty$. If
this is not the case, then we can conclude that $\mathcal{V}$ has at least exponential
complexity (see Theorem 9 below). Otherwise, Algorithm 1 increments $l$ and continues
with the construction of the next layer in the next loop iteration.
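
As a minimal sketch of this check, assuming vExp and tExp are stored in dictionaries and the value infinity is represented by `math.inf` (the function name `reports_exponential` and the dictionary keys are hypothetical):

```python
import math

INF = math.inf  # stands for the initial value "infinity" of vExp and tExp

def reports_exponential(l, v_exp, t_exp):
    """True if no pair (x, t) satisfies l < vExp(x) + tExp(t) < infinity."""
    return not any(
        l < ve + te < INF
        for ve in v_exp.values()
        for te in t_exp.values()
    )

# State after iteration l = 1 of Example 7: x and y have bound 1, two
# transitions have bound 1, everything else is still unbounded.
v_exp = {"x": 1, "y": 1, "z": INF}
t_exp = {"s1->s3": 1, "s4->s2": 1, "s1->s1": INF, "s2->s2": INF}
progress_possible = not reports_exponential(1, v_exp, t_exp)

# A situation in which nothing has been bounded yet.
stuck = reports_exponential(1, {"x": INF}, {"t": INF})
```

In the first call the pair (x, s1->s3) gives 1 + 1 = 2, which lies strictly between 1 and infinity, so the algorithm would continue with the next layer.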


Termination criterion. The algorithm proceeds until either exponential complexity
has been detected or until $\mathrm{vExp}(x) \neq \infty$ and $\mathrm{tExp}(t) \neq \infty$ for all $x \in \mathit{Var}$ and
$t \in \mathrm{Trns}(\mathcal{V})$ (i.e., bounds have been computed for all variables and transitions).


Invariants. We now state some simple invariants maintained by Algorithm 1, 
which are easy to verify: 


- For every node $\eta$ that is a descendant of some node $\eta'$ we have that $\mathrm{VASS}(\eta)$
  is a sub-VASS of $\mathrm{VASS}(\eta')$.

- The value of vExp and tExp is changed at most once for each input; when
  the value is changed, it is changed from $\infty$ to some value $\neq \infty$.

- For every transition $t \in \mathrm{Trns}(\mathcal{V})$ and layer $l$ of $T$, we have that either
  $\mathrm{tExp}(t) \le l$ or there is a node $\eta \in \mathrm{layer}(l)$ such that $t \in \mathrm{Trns}(\mathrm{VASS}(\eta))$.

- We have $\mathrm{tExp}(t) = l$ for $t \in \mathrm{Trns}(\mathcal{V})$ if and only if there is an $\eta \in \mathrm{layer}(l-1)$
  with $t \in \mathrm{Trns}(\mathrm{VASS}(\eta))$ and there is no $\eta \in \mathrm{layer}(l)$ with $t \in \mathrm{Trns}(\mathrm{VASS}(\eta))$.


Example 7. We sketch the execution of Algorithm 1 on $\mathcal{V}_{run}$. In iteration $l = 1$,
we have $\mathit{Var}_{ext} = \{(x,\iota), (y,\iota), (z,\iota)\}$, and thus matrix $D_{ext}$ is identical to the
matrix $D$. Hence, constraint systems (I) and (II) are identical to constraint
systems (P) and (Q), whose optimal solutions $\mu = (1\,4\,4\,1\,1\,1\,1\,1\,0\,0)^T$ and $r = (2\,2\,0)^T$,
$z = (0\,0\,1\,1)^T$ we have discussed in Example 5. Algorithm 1 then sets $\mathrm{tExp}(s_1 \to s_3) = 1$
and $\mathrm{tExp}(s_4 \to s_2) = 1$, creates two children $\eta_A$ and $\eta_B$ of $\iota$ labelled by
$\mathcal{V}_A = (\{s_1, s_2\}, \{s_1 \to s_1, s_1 \to s_2, s_2 \to s_2, s_2 \to s_1\})$ and
$\mathcal{V}_B = (\{s_3, s_4\}, \{s_3 \to s_3, s_3 \to s_4, s_4 \to s_4, s_4 \to s_3\})$, and sets $\mathrm{vExp}(x) = 1$ and $\mathrm{vExp}(y) = 1$. In
iteration $l = 2$, we have $\mathit{Var}_{ext} = \{(x,\eta_A), (y,\eta_A), (x,\eta_B), (y,\eta_B), (z,\iota)\}$ and
the matrix $D_{ext}$ stated in Figure 3. Algorithm 1 obtains $\mu = (1\,1\,1\,1\,0\,0\,0\,0)^T$ and
$r = (1\,2\,2\,1\,1)^T$, $z = (0\,0\,0\,0)^T$ as optimal solutions to (I) and (II). Algorithm 1 then
sets $\mathrm{tExp}(s_1 \to s_2) = \mathrm{tExp}(s_2 \to s_1) = \mathrm{tExp}(s_3 \to s_4) = \mathrm{tExp}(s_4 \to s_3) = 2$,
creates the children $\eta_1, \eta_2$ resp. $\eta_3, \eta_4$ of $\eta_A$ resp. $\eta_B$ with $\eta_i$ labelled by
$\mathcal{V}_i = (\{s_i\}, \{s_i \to s_i\})$, and sets $\mathrm{vExp}(z) = 2$. In iteration $l = 3$, we have
$\mathit{Var}_{ext} = \{(x,\eta_1), (y,\eta_1), (x,\eta_2), (y,\eta_2), (x,\eta_3), (y,\eta_3), (x,\eta_4), (y,\eta_4), (z,\eta_A), (z,\eta_B)\}$ and
the matrix $D_{ext}$ stated in Figure 3. Algorithm 1 obtains $\mu = (0\,0\,0\,0)^T$ and
$r = (1\,1\,1\,3\,3\,1\,1\,1\,1\,1)^T$, $z = (0\,0\,0\,0)^T$ as optimal solutions to (I) and (II).
Algorithm 1 then sets $\mathrm{tExp}(s_i \to s_i) = 3$, for all $i$, and terminates.


Iteration $l = 2$:

$$D_{ext} = \begin{pmatrix}
-1 & 1 & 0 & 0 & 0 & 0 & 0 & 0 \\
1 & -1 & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & -1 & 1 & 0 & 0 & 0 & 0 \\
0 & 0 & 1 & -1 & 0 & 0 & 0 & 0 \\
-1 & 1 & 1 & -1 & -1 & -1 & -1 & -1
\end{pmatrix}$$

with column order $s_1 \to s_1$, $s_2 \to s_2$, $s_3 \to s_3$, $s_4 \to s_4$, $s_2 \to s_1$, $s_1 \to s_2$,
$s_4 \to s_3$, $s_3 \to s_4$ (from left to right) and row order $(x,\eta_A)$, $(y,\eta_A)$, $(x,\eta_B)$,
$(y,\eta_B)$, $(z,\iota)$ (from top to bottom).

Iteration $l = 3$:

$$D_{ext} = \begin{pmatrix}
-1 & 0 & 0 & 0 \\
1 & 0 & 0 & 0 \\
0 & 1 & 0 & 0 \\
0 & -1 & 0 & 0 \\
0 & 0 & -1 & 0 \\
0 & 0 & 1 & 0 \\
0 & 0 & 0 & 1 \\
0 & 0 & 0 & -1 \\
-1 & 1 & 0 & 0 \\
0 & 0 & 1 & -1
\end{pmatrix}$$

with column order $s_1 \to s_1$, $s_2 \to s_2$, $s_3 \to s_3$, $s_4 \to s_4$ (from left to right)
and row order $(x,\eta_1)$, $(y,\eta_1)$, $(x,\eta_2)$, $(y,\eta_2)$, $(x,\eta_3)$, $(y,\eta_3)$, $(x,\eta_4)$, $(y,\eta_4)$,
$(z,\eta_A)$, $(z,\eta_B)$ (from top to bottom).

Fig. 3. The extended update matrices during iteration $l = 2$ and $l = 3$ of
Algorithm 1 on the running example $\mathcal{V}_{run}$ from Section 1.
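
The solutions reported for iteration l = 2 can be checked mechanically. The following sketch assumes the matrix entries as transcribed from the iteration-2 matrix of Figure 3 (the transcription is an assumption of this sketch); the helper names `mat_vec` and `transpose` are hypothetical. Since z is the zero vector in this iteration, the flow term of constraint system (II) vanishes and only the update matrix is needed.

```python
def mat_vec(matrix, vec):
    """Multiply a matrix (list of rows) with a vector."""
    return [sum(a * b for a, b in zip(row, vec)) for row in matrix]

def transpose(matrix):
    return [list(col) for col in zip(*matrix)]

columns = ["s1->s1", "s2->s2", "s3->s3", "s4->s4",
           "s2->s1", "s1->s2", "s4->s3", "s3->s4"]

# Rows: (x, eta_A), (y, eta_A), (x, eta_B), (y, eta_B), (z, iota).
D_ext = [
    [-1,  1,  0,  0,  0,  0,  0,  0],
    [ 1, -1,  0,  0,  0,  0,  0,  0],
    [ 0,  0, -1,  1,  0,  0,  0,  0],
    [ 0,  0,  1, -1,  0,  0,  0,  0],
    [-1,  1,  1, -1, -1, -1, -1, -1],
]

mu = [1, 1, 1, 1, 0, 0, 0, 0]   # solution of (I) from Example 7
r = [1, 2, 2, 1, 1]             # solution of (II) from Example 7, with z = 0

growth = mat_vec(D_ext, mu)            # D_ext * mu, must be >= 0
drift = mat_vec(transpose(D_ext), r)   # D_ext^T * r, must be <= 0

# Transitions with strictly negative drift form the set R of this iteration.
removed = [t for t, value in zip(columns, drift) if value < 0]
```

The four transitions with strictly negative drift are exactly the ones that Example 7 assigns the bound 2.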


We now state the main properties of Algorithm 1: 
Lemma 8. Algorithm 1 always terminates. 


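Theorem 9. If Algorithm 1 returns “$\mathcal{V}$ has at least exponential complexity”,
then $\mathrm{comp}_{\mathcal{V}}(N) \in 2^{\Omega(N)}$, and we have $\mathrm{tbound}_t(N) \in 2^{\Omega(N)}$ for all $t \in \mathrm{Trns}(\mathcal{V})$
with $\mathrm{tExp}(t) = \infty$ and $\mathrm{vbound}_x(N) \in 2^{\Omega(N)}$ for all $x \in \mathit{Var}$ with $\mathrm{vExp}(x) = \infty$.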


The proof of Theorem 9 is stated in Section 8. We now assume that Algorithm 1
does not return “$\mathcal{V}$ has at least exponential complexity”. Then, Algorithm 1
must terminate with $\mathrm{tExp}(t) \neq \infty$ and $\mathrm{vExp}(x) \neq \infty$ for all $t \in \mathrm{Trns}(\mathcal{V})$ and
$x \in \mathit{Var}$. The following result states that tExp and vExp contain the precise
exponents of the asymptotic transition and variable bounds of $\mathcal{V}$:


Theorem 10. $\mathrm{vbound}_N(x) \in \Theta(N^{\mathrm{vExp}(x)})$ for all $x \in \mathit{Var}$ and $\mathrm{tbound}_N(t) \in
\Theta(N^{\mathrm{tExp}(t)})$ for all $t \in \mathrm{Trns}(\mathcal{V})$.


The upper bounds of Theorem 10 will be proved in Section 5 (Theorem 16) 
and the lower bounds in Section 6 (Corollary 20). 

We will prove in Section 7 that the exponents of the variable and transition 
bounds are bounded exponentially in the dimension of V: 


Theorem 11. We have $\mathrm{vExp}(x) \le 2^{|\mathit{Var}|}$ for all $x \in \mathit{Var}$ and $\mathrm{tExp}(t) \le 2^{|\mathit{Var}|}$
for all $t \in \mathrm{Trns}(\mathcal{V})$.


Finally, we obtain the following corollary from Theorems 10 and 11: 


Corollary 12. Let $\mathcal{V}$ be a connected VASS. Then, either $\mathrm{comp}_{\mathcal{V}}(N) \in 2^{\Omega(N)}$ or
$\mathrm{comp}_{\mathcal{V}}(N) \in \Theta(N^i)$ for some computable $1 \le i \le 2^{|\mathit{Var}|}$.


4.1 Complexity of Algorithm 1
In the remainder of this section we will establish the following result: 


Theorem 13. Algorithm 1 (with the below stated optimization) can be implemented
in polynomial time with regard to the size of the input VASS $\mathcal{V}$.


We will argue that A) every loop iteration of Algorithm 1 only takes polynomial
time, and B) that polynomially many loop iterations are sufficient (this
only holds for the optimization of the algorithm discussed below).

Let $\mathcal{V}$ be a VASS, let $m = |\mathrm{Trns}(\mathcal{V})|$ be the number of transitions of $\mathcal{V}$, and
let $n = |\mathit{Var}|$ be the dimension of $\mathcal{V}$. We note that $|\mathrm{layer}(l)| \le m$ for every layer
$l$ of $T$, because the VASSs of the nodes in the same layer are disjoint.

A) Clearly, removing the decreasing transitions and computing the strongly
connected components can be done in polynomial time. It remains to argue
about constraint systems (I) and (II). We observe that $|\mathit{Var}_{ext}| = |\{(x,\eta) \mid
\eta \in \mathrm{layer}(l - \mathrm{vExp}(x))\}| \le n \cdot m$ and $|U| \le m$. Hence the size of constraint
systems (I) and (II) is polynomial in the size of $\mathcal{V}$. Moreover, constraint systems (I)
and (II) can be solved in PTIME as noted in Section 3.

B) We do not a priori have a bound on the number of iterations of the main
loop of Algorithm 1. (Theorem 11 implies that the number of iterations is at
most exponential; however, we do not use this result here.) We will shortly state
an improvement of Algorithm 1 that ensures that polynomially many iterations
are sufficient. The underlying insight is that certain layers of the tree do not
need to be constructed explicitly. This insight is stated in the lemma below:


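Lemma 14. We consider the point in time when the execution of Algorithm 1
reaches line $l := l + 1$ during some loop iteration $l \ge 1$. Let $\mathit{RelevantLayers} =
\{\mathrm{tExp}(t) + \mathrm{vExp}(x) \mid x \in \mathit{Var}, t \in \mathrm{Trns}(\mathcal{V})\}$ and let $l' = \min\{l' \mid l' > l, l' \in
\mathit{RelevantLayers}\}$. Then, $\mathrm{vExp}(x) \neq i$ and $\mathrm{tExp}(t) \neq i$ for all $x \in \mathit{Var}$, $t \in
\mathrm{Trns}(\mathcal{V})$ and $l < i < l'$.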


We now present the optimization that achieves polynomially many loop iterations.
We replace the line $l := l + 1$ by the two lines $\mathit{RelevantLayers} := \{\mathrm{tExp}(t) +
\mathrm{vExp}(x) \mid x \in \mathit{Var}, t \in \mathrm{Trns}(\mathcal{V})\}$ and $l := \min\{l' \mid l' > l, l' \in \mathit{RelevantLayers}\}$.
The effect of these two lines is that Algorithm 1 directly skips to the next relevant
layer. Lemma 14, stated above, justifies this optimization: First, no new
variable or transition bound is discovered in the intermediate layers $l < i < l'$.
Second, each intermediate layer $l < i < l'$ has the same number of nodes as layer
$l$, which are labelled by the same sub-VASSs as the nodes in $l$ (otherwise there
would be a transition with transition bound $l < i < l'$); hence, whenever needed,
Algorithm 1 can construct a missing layer $l < i < l'$ on-the-fly from layer $l$.

We now analyze the number of loop iterations of the optimized algorithm. We
recall that the value of each $\mathrm{vExp}(x)$ and $\mathrm{tExp}(t)$ is changed at most once from
$\infty$ to some value $\neq \infty$. Hence, Algorithm 1 encounters at most $n \cdot m$ different
values in the set $\mathit{RelevantLayers} = \{\mathrm{tExp}(t) + \mathrm{vExp}(x) \mid x \in \mathit{Var}, t \in \mathrm{Trns}(\mathcal{V})\}$
during execution. Thus, the number of loop iterations is bounded by $n \cdot m$.
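
The two replacement lines of the optimization can be sketched as follows. The dictionary encoding of vExp and tExp, the use of `math.inf` for infinity, and the function name `next_relevant_layer` are assumptions of this sketch. It returns `None` when no finite relevant layer above l exists, a case the check for exponential complexity would already have caught.

```python
import math

INF = math.inf  # stands for the value "infinity" of vExp and tExp

def next_relevant_layer(l, v_exp, t_exp):
    """Smallest finite value of tExp(t) + vExp(x) that is strictly above l."""
    relevant_layers = {te + ve for te in t_exp.values() for ve in v_exp.values()}
    candidates = [value for value in relevant_layers if l < value < INF]
    return min(candidates) if candidates else None

# Bounds of Example 7 after iteration l = 2 (self-loops still unbounded).
v_exp = {"x": 1, "y": 1, "z": 2}
t_exp = {
    "s1->s3": 1, "s4->s2": 1,
    "s1->s2": 2, "s2->s1": 2, "s3->s4": 2, "s4->s3": 2,
    "s1->s1": INF, "s2->s2": INF, "s3->s3": INF, "s4->s4": INF,
}
after_two = next_relevant_layer(2, v_exp, t_exp)
```

Here the finite sums are 2, 3 and 4, so the optimized algorithm would continue with layer 3, which coincides with l + 1 on this small example.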
5 Proof of the Upper Bound Theorem


We begin by stating a proof principle for obtaining upper bounds. 


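Proposition 15 (Bound Proof Principle). Let $\mathcal{V}$ be a VASS. Let $U \subseteq
\mathrm{Trns}(\mathcal{V})$ be a subset of the transitions of $\mathcal{V}$. Let $w : \mathrm{Cfg}(\mathcal{V}) \to \mathbb{N}$ and $\mathrm{inc}_t :
\mathbb{N} \to \mathbb{N}$, for every $t \in \mathrm{Trns}(\mathcal{V}) \setminus U$, be functions such that for every trace
$\zeta = (s_0,\nu_0) \xrightarrow{d_1} (s_1,\nu_1) \xrightarrow{d_2} \cdots$ of $\mathcal{V}$ with $\mathrm{init}(\zeta) \le N$ we have for every
$i \ge 0$ that

1) $s_i \xrightarrow{d_{i+1}} s_{i+1} \in U$ implies $w(s_i,\nu_i) \ge w(s_{i+1},\nu_{i+1})$, and
2) $t = s_i \xrightarrow{d_{i+1}} s_{i+1} \in \mathrm{Trns}(\mathcal{V}) \setminus U$ implies $w(s_i,\nu_i) + \mathrm{inc}_t(N) \ge w(s_{i+1},\nu_{i+1})$.

We call such a function $w$ a complexity witness and the associated $\mathrm{inc}_t$ functions
the increase certificates.
Let $t \in U$ be a transition on which $w$ decreases, i.e., we have $w(s_1,\nu_1) \ge
w(s_2,\nu_2) + 1$ for every step $(s_1,\nu_1) \xrightarrow{d} (s_2,\nu_2)$ of $\mathcal{V}$ with $t = s_1 \xrightarrow{d} s_2$. Then,

$$\mathrm{tbound}_t(N) \le \max_{(s,\nu) \in \mathrm{Cfg}(\mathcal{V}),\, \|\nu\| \le N} w(s,\nu) + \sum_{t' \in \mathrm{Trns}(\mathcal{V}) \setminus U} \mathrm{tbound}_{t'}(N) \cdot \mathrm{inc}_{t'}(N).$$

Further, let $x \in \mathit{Var}$ be a variable such that $\nu(x) \le w(s,\nu)$ for all $(s,\nu) \in
\mathrm{Cfg}(\mathcal{V})$. Then,

$$\mathrm{vbound}_x(N) \le \max_{(s,\nu) \in \mathrm{Cfg}(\mathcal{V}),\, \|\nu\| \le N} w(s,\nu) + \sum_{t' \in \mathrm{Trns}(\mathcal{V}) \setminus U} \mathrm{tbound}_{t'}(N) \cdot \mathrm{inc}_{t'}(N).$$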


Proof Outline of the Upper Bound Theorem. Let $\mathcal{V}$ be a VASS for which Algorithm 1
does not report exponential complexity. We will prove by induction on
loop iteration $l$ that $\mathrm{vbound}_N(x) \in O(N^l)$ for every $x \in \mathit{Var}$ with $\mathrm{vExp}(x) = l$
and that $\mathrm{tbound}_N(t) \in O(N^l)$ for every $t \in \mathrm{Trns}(\mathcal{V})$ with $\mathrm{tExp}(t) = l$.

We now consider some loop iteration $l \ge 1$. Let $U = \bigcup_{\eta \in \mathrm{layer}(l-1)} \mathrm{Trns}(\mathrm{VASS}(\eta))$
be the transitions, $\mathit{Var}_{ext}$ be the set of extended variables and $D_{ext} \in \mathbb{Z}^{\mathit{Var}_{ext} \times U}$
be the update matrix considered by Algorithm 1 during loop iteration $l$. Let $r, z$
be some optimal solution to constraint system (II) computed by Algorithm 1
during loop iteration $l$. The main idea for the upper bound proof is to use the
quasi-ranking function from Lemma 3 as witness function for the Bound Proof
Principle. In order to apply Lemma 3 we need to consider the VASS associated
to the matrices in constraint system (I): Let $\mathcal{V}_{ext}$ be the VASS over variables
$\mathit{Var}_{ext}$ associated to update matrix $D_{ext}$ and flow matrix $F|_U$. From Lemma 3
we get that $\mathrm{rank}(r,z) : \mathrm{Cfg}(\mathcal{V}_{ext}) \to \mathbb{N}$ is a quasi-ranking function for $\mathcal{V}_{ext}$. We
now need to relate $\mathcal{V}$ to the extended VASS $\mathcal{V}_{ext}$ in order to be able to use this
quasi-ranking function. We do so by extending valuations over $\mathit{Var}$ to valuations
over $\mathit{Var}_{ext}$. For every state $s \in \mathrm{St}(\mathcal{V})$ and valuation $\nu : \mathit{Var} \to \mathbb{N}$, we define the
extended valuation $\mathrm{ext}_s(\nu) : \mathit{Var}_{ext} \to \mathbb{N}$ by setting

$$\mathrm{ext}_s(\nu)(x,\eta) = \begin{cases} \nu(x), & \text{if } s \in \mathrm{St}(\mathrm{VASS}(\eta)), \\ 0, & \text{otherwise.} \end{cases}$$
As a direct consequence from the definition of extended valuations, we have
that $(s, \mathrm{ext}_s(\nu)) \in \mathrm{Cfg}(\mathcal{V}_{ext})$ for all $(s,\nu) \in \mathrm{Cfg}(\mathcal{V})$, and that
$(s_1, \mathrm{ext}_{s_1}(\nu_1)) \to (s_2, \mathrm{ext}_{s_2}(\nu_2))$ is a step of $\mathcal{V}_{ext}$ for every step
$(s_1,\nu_1) \xrightarrow{d} (s_2,\nu_2)$ of $\mathcal{V}$ with $s_1 \xrightarrow{d} s_2 \in U$. We now define the witness
function $w$ by setting

$$w(s,\nu) = \mathrm{rank}(r,z)(s, \mathrm{ext}_s(\nu)) \quad \text{for all } (s,\nu) \in \mathrm{Cfg}(\mathcal{V}).$$

We immediately get from Lemma 3 that $w$ maps configurations to the non-negative
integers and that condition 1) of the Bound Proof Principle is satisfied.
Indeed, we get from the first item of Lemma 3 that $w(s,\nu) \ge 0$ for all $(s,\nu) \in
\mathrm{Cfg}(\mathcal{V})$, and from the second item that $w(s_1,\nu_1) \ge w(s_2,\nu_2)$ for every step
$(s_1,\nu_1) \xrightarrow{d} (s_2,\nu_2)$ of $\mathcal{V}$ with $t = s_1 \xrightarrow{d} s_2 \in U$; moreover, the inequality is strict if
$(D_{ext}^T r + F|_U^T z)(t) < 0$, i.e., the witness function $w$ decreases for transitions $t$ with
$\mathrm{tExp}(t) = l$. It remains to establish condition 2) of the Bound Proof Principle. We
will argue that we can find increase certificates $\mathrm{inc}_t(N) \in O(N^{l - \mathrm{tExp}(t)})$ for all
$t \in \mathrm{Trns}(\mathcal{V}) \setminus U$. We note that $\mathrm{tExp}(t) < l$ for all $t \in \mathrm{Trns}(\mathcal{V}) \setminus U$, and hence the
induction assumption can be applied for such $t$. We can then derive the desired
bounds from the Bound Proof Principle because of

$$\sum_{t \in \mathrm{Trns}(\mathcal{V}) \setminus U} \mathrm{tbound}_t(N) \cdot \mathrm{inc}_t(N) = \sum_{t \in \mathrm{Trns}(\mathcal{V}) \setminus U} O(N^{\mathrm{tExp}(t)}) \cdot O(N^{l - \mathrm{tExp}(t)}) = O(N^l).$$


Theorem 16. $\mathrm{vbound}_N(x) \in O(N^{\mathrm{vExp}(x)})$ for all $x \in \mathit{Var}$ and $\mathrm{tbound}_N(t) \in
O(N^{\mathrm{tExp}(t)})$ for all $t \in \mathrm{Trns}(\mathcal{V})$.


6 Proof of the Lower Bound Theorem 


The following lemma will allow us to consider traces $\zeta_N$ with $\mathrm{init}(\zeta_N) \in O(N)$
instead of $\mathrm{init}(\zeta_N) \le N$ when proving asymptotic lower bounds.


Lemma 17. Let $\mathcal{V}$ be a VASS, let $t \in \mathrm{Trns}(\mathcal{V})$ be a transition and let $x \in \mathit{Var}$ be
a variable. If there are traces $\zeta_N$ with $\mathrm{init}(\zeta_N) \in O(N)$ and $\mathrm{instance}(\zeta_N,t) \ge
N^i$, then $\mathrm{tbound}_N(t) \in \Omega(N^i)$. If there are traces $\zeta_N$ with $\mathrm{init}(\zeta_N) \in O(N)$
that reach a final valuation $\nu$ with $\nu(x) \ge N^i$, then $\mathrm{vbound}_N(x) \in \Omega(N^i)$.


The lower bound proof uses the notion of a pre-path, which relaxes the notion
of a path: A pre-path $\sigma = t_1 \cdots t_k$ is a finite sequence of transitions $t_i = s_i \xrightarrow{d_i} s_i'$.
Note that we do not require for subsequent transitions that the end state of
one transition is the start state of the next transition, i.e., we do not require
$s_i' = s_{i+1}$. We generalize notions from paths to pre-paths in the obvious way,
e.g., we set $\mathrm{val}(\sigma) = \sum_{i \in [1,k]} d_i$ and denote by $\mathrm{instance}(\sigma,t)$, for $t \in \mathrm{Trns}(\mathcal{V})$,
the number of times $\sigma$ contains the transition $t$. We say the pre-path $\sigma$ can be
executed from valuation $\nu$, if there are valuations $\nu_i \ge 0$ with $\nu_{i+1} = \nu_i + d_{i+1}$
for all $0 \le i < k$ and $\nu = \nu_0$; we further say that $\sigma$ reaches valuation $\nu'$, if
$\nu' = \nu_k$. We will need the following relationship between execution and traces:
in case a pre-path $\sigma$ is actually a path, $\sigma$ can be executed from valuation $\nu$, if
and only if there is a trace with initial valuation $\nu$ that uses the same sequence
of transitions as $\sigma$. Two pre-paths $\sigma = t_1 \cdots t_k$ and $\sigma' = t_1' \cdots t_l'$ can be shuffled
into a pre-path $\sigma'' = t_1'' \cdots t_{k+l}''$, if $\sigma''$ is an order-preserving interleaving of $\sigma$
and $\sigma'$; formally, there are injective monotone functions $f : [1,k] \to [1,k+l]$
and $g : [1,l] \to [1,k+l]$ with $f([1,k]) \cap g([1,l]) = \emptyset$ such that $t_{f(i)}'' = t_i$ for all
$i \in [1,k]$ and $t_{g(i)}'' = t_i'$ for all $i \in [1,l]$. Further, for $d \ge 1$ and pre-path $\sigma$, we
denote by $\sigma^d = \underbrace{\sigma\sigma \cdots \sigma}_{d}$ the pre-path that consists of $d$ subsequent copies of $\sigma$.
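
The notions of executing and shuffling pre-paths can be made concrete with a small Python sketch. The encoding is an assumption of this sketch: a transition is a `(source, update, target)` triple whose update is a dictionary from variable names to integers, and the function names `execute` and `is_shuffle` as well as the two example transitions are hypothetical.

```python
def execute(prepath, valuation):
    """Return the valuation reached by the pre-path, or None if some
    intermediate valuation would have a negative entry."""
    current = dict(valuation)
    for _src, update, _dst in prepath:
        current = {x: current[x] + update.get(x, 0) for x in current}
        if any(value < 0 for value in current.values()):
            return None
    return current

def is_shuffle(merged, first, second):
    """Check that merged is an order-preserving interleaving of first and second."""
    k, l = len(first), len(second)
    if len(merged) != k + l:
        return False
    # Track all pairs (i, j) such that the prefix read so far can be split
    # into the first i transitions of `first` and the first j of `second`.
    reachable = {(0, 0)}
    for t in merged:
        following = set()
        for i, j in reachable:
            if i < k and first[i] == t:
                following.add((i + 1, j))
            if j < l and second[j] == t:
                following.add((i, j + 1))
        reachable = following
    return (k, l) in reachable

# Two illustrative self-loops; states are ignored, as pre-paths allow.
a = ("s1", {"x": -1, "y": 1}, "s1")
b = ("s2", {"x": 1, "y": -1}, "s2")

reached = execute([a, b], {"x": 1, "y": 0})
blocked = execute([a, a], {"x": 1, "y": 0})
interleaved = is_shuffle([a, b, a], [a, a], [b])
not_interleaved = is_shuffle([a, a, b], [b, a], [a])
```

The pre-path `[a, b]` is a pre-path but not a path, because the end state of `a` differs from the start state of `b`; it can still be executed, which is exactly the relaxation the definition permits.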


For the remainder of this section, we fix a VASS $\mathcal{V}$ for which Algorithm 1 does
not report exponential complexity and we fix the computed tree $T$ and bounds
vExp, tExp. We further need to use the solutions to constraint system (I) computed
during the run of Algorithm 1: For every layer $l \ge 1$ and node $\eta \in \mathrm{layer}(l)$,
we fix a cycle $C(\eta)$ that contains $\mu(t)$ instances of every $t \in \mathrm{Trns}(\mathrm{VASS}(\eta))$, where
$\mu$ is an optimal solution to constraint system (I) during loop iteration $l$. The
existence of such cycles is stated in Lemma 18 below. We note that this definition
ensures $\mathrm{val}(C(\eta)) = \sum_{t \in \mathrm{Trns}(\mathrm{VASS}(\eta))} D(t) \cdot \mu(t)$. Further, for the root node $\iota$, we
fix an arbitrary cycle $C(\iota)$ that uses all transitions of $\mathcal{V}$ at least once.


Lemma 18. Let $\mu$ be an optimal solution to constraint system (I) during loop
iteration $l$ of Algorithm 1. Then there is a cycle $C(\eta)$ for every $\eta \in \mathrm{layer}(l)$
that contains exactly $\mu(t)$ instances of every transition $t \in \mathrm{Trns}(\mathrm{VASS}(\eta))$.


Proof Outline of the Lower Bound Theorem. 
Step I) We define a pre-path $\tau_l$, for every $l \ge 1$, with the following properties:

1) $\mathrm{instance}(\tau_l,t) \ge N^{l+1}$ for all transitions $t \in \bigcup_{\eta \in \mathrm{layer}(l)} \mathrm{Trns}(\mathrm{VASS}(\eta))$.
2) $\mathrm{val}(\tau_l) = N^{l+1} \sum_{\eta \in \mathrm{layer}(l)} \mathrm{val}(C(\eta))$.
3) $\mathrm{val}(\tau_l)(x) \ge 0$ for every $x \in \mathit{Var}$ with $\mathrm{vExp}(x) \le l$.
4) $\mathrm{val}(\tau_l)(x) \ge N^{l+1}$ for every $x \in \mathit{Var}$ with $\mathrm{vExp}(x) \ge l+1$.
5) $\tau_l$ is executable from some valuation $\nu$ with
   a) $\nu(x) \in O(N^{\mathrm{vExp}(x)})$ for $x \in \mathit{Var}$ with $\mathrm{vExp}(x) \le l$, and
   b) $\nu(x) \in O(N^l)$ for $x \in \mathit{Var}$ with $\mathrm{vExp}(x) \ge l+1$.

The difficulty in the construction of the pre-paths $\tau_l$ lies in ensuring Property 5).
The construction of the $\tau_l$ proceeds along the tree $T$ using that the cycles $C(\eta)$
have been obtained according to solutions of constraint system (I).

Step II) It is now a direct consequence of Properties 3)-5) stated above that
we can choose a sufficiently large $k > 0$ such that for every $l \ge 0$ the pre-path
$\rho_l = \tau_0^k \tau_1^k \cdots \tau_l^k$ (the concatenation of $k$ copies of each $\tau_i$, setting $\tau_0 = C(\iota)^N$),
can be executed from some valuation $\nu$ and reaches a valuation $\nu'$ with

1) $\|\nu\| \in O(N^l)$,
2) $\nu'(x) \ge kN^{\mathrm{vExp}(x)}$ for all $x \in \mathit{Var}$ with $\mathrm{vExp}(x) \le l$, and
3) $\nu'(x) \ge kN^{l+1}$ for all $x \in \mathit{Var}$ with $\mathrm{vExp}(x) \ge l+1$.

The above stated properties for the pre-path $\rho_{l_{max}}$, where $l_{max}$ is the maximal
layer of $T$, would be sufficient to conclude the lower bound proof except that we
need to extend the proof from pre-paths to proper paths.




Step III) In order to extend the proof from pre-paths to paths we make
use of the concept of shuffling. For all $l \ge 0$, we will define paths $\gamma_l$ that can be
obtained by shuffling the pre-paths $\rho_0, \rho_1, \ldots, \rho_l$: The path $\gamma_{l_{max}}$, where $l_{max}$ is
the maximal layer of $T$, then has the desired properties and allows us to conclude
the lower bound proof with the following result:


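Theorem 19. There are traces $\zeta_N$ with $\mathrm{init}(\zeta_N) \in O(N)$ such that $\zeta_N$ ends
in configuration $(s_N,\nu_N)$ with $\nu_N(x) \ge N^{\mathrm{vExp}(x)}$ for all variables $x \in \mathit{Var}$ and
we have $\mathrm{instance}(\zeta_N,t) \ge N^{\mathrm{tExp}(t)}$ for all transitions $t \in \mathrm{Trns}(\mathcal{V})$.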


With Lemma 17 we get the desired lower bounds from Theorem 19: 


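Corollary 20. $\mathrm{vbound}_N(x) \in \Omega(N^{\mathrm{vExp}(x)})$ for all $x \in \mathit{Var}$ and $\mathrm{tbound}_N(t) \in
\Omega(N^{\mathrm{tExp}(t)})$ for all $t \in \mathrm{Trns}(\mathcal{V})$.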


7 The Size of the Exponents 


For the remainder of this section, we fix a VASS $\mathcal{V}$ for which Algorithm 1 does
not report exponential complexity and we fix the computed tree $T$ and bounds
vExp, tExp. Additionally, we fix a vector $z_l \in \mathbb{Z}^{\mathrm{St}(\mathcal{V})}$ for every layer $l$ of $T$ and a
vector $r_\eta \in \mathbb{Z}^{\mathit{Var}}$ for every node $\eta \in \mathrm{layer}(l)$ as follows: Let $r, z$ be an optimal
solution to constraint system (II) in iteration $l+1$ of Algorithm 1. We then set
$z_l = z$. For every $\eta \in \mathrm{layer}(l)$ we define $r_\eta$ by setting $r_\eta(x) = r(x,\eta')$, where
$\eta' \in \mathrm{layer}(l - \mathrm{vExp}(x))$ is the unique ancestor of $\eta$ in layer $l - \mathrm{vExp}(x)$. The
following properties are immediate from the definition:


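Proposition 21. For every layer $l$ of $T$ and node $\eta \in \mathrm{layer}(l)$ we have:

1) $z_l \ge 0$ and $r_\eta \ge 0$.
2) $r_\eta^T d + z_l(s_2) - z_l(s_1) \le 0$ for every transition $s_1 \xrightarrow{d} s_2 \in \mathrm{Trns}(\mathrm{VASS}(\eta))$;
   moreover, the inequality is strict for all transitions $t$ with $\mathrm{tExp}(t) = l+1$.
3) Let $\eta' \in \mathrm{layer}(i)$ be a strict ancestor of $\eta$. Then, $r_{\eta'}^T d + z_i(s_2) - z_i(s_1) = 0$
   for every transition $s_1 \xrightarrow{d} s_2 \in \mathrm{Trns}(\mathrm{VASS}(\eta))$.
4) For every $x \in \mathit{Var}$ with $\mathrm{vExp}(x) = l+1$ we have $r_\eta(x) > 0$ and $r_\eta(x) = r_{\eta'}(x)$
   for all $\eta' \in \mathrm{layer}(l)$.
5) For every $x \in \mathit{Var}$ with $\mathrm{vExp}(x) > l+1$ we have $r_\eta(x) = 0$.
6) For every $x \in \mathit{Var}$ with $\mathrm{vExp}(x) \le l$ there is an ancestor $\eta' \in \mathrm{layer}(i)$ of $\eta$
   such that $r_{\eta'}(x) > 0$ and $r_{\eta'}(x') = 0$ for all $x'$ with $\mathrm{vExp}(x') > \mathrm{vExp}(x)$.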


For a vector $r \in \mathbb{Z}^{\mathit{Var}}$, we define the potential of $r$ by setting $\mathrm{pot}(r) =
\max\{\mathrm{vExp}(x) \mid x \in \mathit{Var}, r(x) \neq 0\}$, where we set $\max \emptyset = 0$. The motivation for
this definition is that we have $r^T \nu \in O(N^{\mathrm{pot}(r)})$ for every valuation $\nu$ reachable
by a trace $\zeta$ with $\mathrm{init}(\zeta) \le N$. We will now define the potential of a set
of vectors $Z \subseteq \mathbb{Z}^{\mathit{Var}}$. Let $M$ be a matrix whose columns are the vectors of $Z$
and whose rows are ordered according to the variable bounds, i.e., if the row
associated to variable $x'$ is above the row associated to variable $x$, then we have
$\mathrm{vExp}(x') \ge \mathrm{vExp}(x)$. Let $L$ be some lower triangular matrix obtained from $M$ by
elementary column operations. We now define $\mathrm{pot}(Z) = \sum_{\text{column } r \text{ of } L} \mathrm{pot}(r)$,
where we set $\sum \emptyset = 0$. We note that $\mathrm{pot}(Z)$ is well-defined, because the value
$\mathrm{pot}(Z)$ does not depend on the choice of $M$ and $L$.

We next state an upper bound on potentials. Let $l \ge 0$ and let $B_l =
\{\mathrm{vExp}(x) \mid x \in \mathit{Var}, \mathrm{vExp}(x) < l\}$ be the set of variable bounds below $l$. We
set $\mathrm{varsum}(l) = 1$, for $B_l = \emptyset$, and $\mathrm{varsum}(l) = \sum B_l$, otherwise. The following
statement is a direct consequence of the definitions:


Proposition 22. Let $Z \subseteq \mathbb{Z}^{\mathit{Var}}$ be a set of vectors such that $r(x) = 0$ for all
$r \in Z$ and $x \in \mathit{Var}$ with $\mathrm{vExp}(x) > l$. Then, we have $\mathrm{pot}(Z) \le \mathrm{varsum}(l+1)$.


We define $\mathrm{pot}(\eta) = \mathrm{pot}(\{r_{\eta'} \mid \eta' \text{ is a strict ancestor of } \eta\})$ as the potential
of a node $\eta$. We note that $\mathrm{pot}(\eta) \le \mathrm{varsum}(l+1)$ for every node $\eta \in \mathrm{layer}(l)$
by Proposition 22. Now, we are able to state the main results of this section:


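Lemma 23. Let $\eta$ be a node in $T$. Then, every trace $\zeta$ with $\mathrm{init}(\zeta) \le N$ enters
$\mathrm{VASS}(\eta)$ at most $O(N^{\mathrm{pot}(\eta)})$ times, i.e., $\zeta$ contains at most $O(N^{\mathrm{pot}(\eta)})$ transitions
$s \xrightarrow{d} s'$ with $s \notin \mathrm{St}(\mathrm{VASS}(\eta))$ and $s' \in \mathrm{St}(\mathrm{VASS}(\eta))$.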


Lemma 24. For every layer $l$, we have that $\mathrm{vExp}(x) = l$ resp. $\mathrm{tExp}(t) = l$
implies $\mathrm{vExp}(x) \le \mathrm{varsum}(l)$ resp. $\mathrm{tExp}(t) \le \mathrm{varsum}(l)$.


The next result follows from Lemma 24 only by arithmetic manipulations 
and induction on l: 


Lemma 25. Let $l$ be some layer. Let $k$ be the number of variables $x \in \mathit{Var}$ with
$\mathrm{vExp}(x) < l$. Then, $\mathrm{varsum}(l) \le 2^k$.


Theorem 11 is then a direct consequence of Lemma 24 and 25 (using k < | Var|). 
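
The quantity varsum and the bounds of Lemmas 24 and 25 can be evaluated on the bounds computed in Example 7. The following sketch rests on one interpretive assumption that should be checked against the definition: the sum is taken once per variable with a bound below l, which is the reading under which the count k of Lemma 25 enters the bound. The function name `varsum` follows the text; the dictionary encoding is hypothetical.

```python
def varsum(l, v_exp):
    """Sum of the variable bounds below l, counted once per variable
    (assumption of this sketch); 1 if no variable has a bound below l."""
    below = [bound for bound in v_exp.values() if bound < l]
    return sum(below) if below else 1

# Bounds computed in Example 7.
v_exp = {"x": 1, "y": 1, "z": 2}
max_t_exp = 3  # the self-loops receive transition bound 3

values = {l: varsum(l, v_exp) for l in (1, 2, 3)}

# Lemma 24: a bound discovered in layer l is at most varsum(l).
lemma_24 = all(bound <= varsum(bound, v_exp) for bound in v_exp.values())
lemma_24 = lemma_24 and max_t_exp <= varsum(max_t_exp, v_exp)

# Lemma 25: varsum(l) <= 2^k, where k counts the variables with bound below l.
lemma_25 = all(
    varsum(l, v_exp) <= 2 ** sum(1 for bound in v_exp.values() if bound < l)
    for l in (1, 2, 3)
)
```

On the running example both inequalities hold with room to spare, which matches the doubling argument behind the exponential bound of Theorem 11.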


8 Exponential Witness 


The following lemma from [15] states a condition that is sufficient for a VASS
to have exponential complexity*. We will use this lemma to prove Theorem 9:


Lemma 26 (Lemma 10 of [15]). Let $\mathcal{V}$ be a connected VASS, let $U, W$ be a
partitioning of $\mathit{Var}$ and let $C_1, \ldots, C_m$ be cycles such that a) $\mathrm{val}(C_i)(x) \ge 0$ for
all $x \in U$ and $1 \le i \le m$, and b) $\sum_i \mathrm{val}(C_i)(x) \ge 1$ for all $x \in W$. Then, there
is a $c > 1$ and paths $\pi_N$ such that 1) $\pi_N$ can be executed from initial valuation
$N \cdot \mathbf{1}$, 2) $\pi_N$ reaches a valuation $\nu$ with $\nu(x) \ge c^N$ for all $x \in W$ and 3) $(C_i)^{c^N}$
is a sub-path of $\pi_N$ for each $1 \le i \le m$.


We now outline the proof of Theorem 9: We assume that Algorithm 1 returned
“$\mathcal{V}$ has at least exponential complexity” in loop iteration $l$. According to
Lemma 18, there are cycles $C(\eta)$, for every node $\eta \in \mathrm{layer}(l)$, that contain $\mu(t)$
instances of every transition $t \in \mathrm{Trns}(\mathrm{VASS}(\eta))$. One can then show that the cycles
$C(\eta)$ and the sets $U = \{x \in \mathit{Var} \mid \mathrm{vExp}(x) \le l\}$, $W = \{x \in \mathit{Var} \mid \mathrm{vExp}(x) >
l\}$ satisfy the requirements of Lemma 26, which establishes Theorem 9.


* Our formalization differs from [15], but it is easy to verify that our conditions a) and
b) are equivalent to the conditions on the cycles in the ‘iteration schemes’ of [15].